TITLE

DIPEPTIDYL PEPTIDASES 

FIELD OF INVENTION 

The invention relates to a dipeptidyl peptidase, to a
nucleic acid molecule which encodes it, and to uses of the
peptidase.

BACKGROUND OF THE INVENTION 

The dipeptidyl peptidase (DPP) IV-like gene family is a
family of molecules which have related protein structure
and function [1-3]. The gene family includes the following
molecules: DPPIV (CD26), dipeptidyl amino-peptidase-like
protein 6 (DPP6), dipeptidyl amino-peptidase-like protein 8
(DPP8) and fibroblast activation protein (FAP) [1,2,4,5].
Another possible member is DPPIV-β [6].

The molecules of the DPPIV-like gene family are serine
proteases. They are members of the peptidase family S9b
and, together with prolyl endopeptidase (S9a) and
acylaminoacyl peptidase (S9c), they are comprised in the
prolyl oligopeptidase family [5,7].

DPPIV and FAP both have similar postproline dipeptidyl
amino peptidase activity; however, unlike DPPIV, FAP also
has gelatinase activity [8,9].

DPPIV substrates include chemokines such as RANTES,
eotaxin, macrophage-derived chemokine and
stromal-cell-derived factor 1; growth factors such as
glucagon and glucagon-like peptides 1 and 2; neuropeptides
including neuropeptide Y and substance P; and vasoactive
peptides [10-12].

DPPIV and FAP also have non-catalytic activity; DPPIV binds
adenosine deaminase, and FAP binds to α3β1 and α5β1
integrin [13-14].

In view of the above activities, the DPPIV-like family
members are likely to have roles in intestinal and renal
handling of proline containing peptides, cell adhesion,
peptide metabolism, including metabolism of cytokines,
neuropeptides, growth factors and chemokines, and
immunological processes, specifically T cell
stimulation [3,11,12].

Consequently, the DPPIV-like family members are likely to
be involved in the pathology of disease, including for
example, tumour growth and biology, type II diabetes,
cirrhosis, autoimmunity, graft rejection and HIV
infection [3,15-18].

Inhibitors of DPPIV have been shown to suppress arthritis,
and to prolong cardiac allograft survival in animal models
in vivo [19,20]. Some DPPIV inhibitors are reported to
inhibit HIV infection [21]. It is anticipated that DPPIV
inhibitors will be useful in other therapeutic applications
including treating diarrhoea, growth hormone deficiency,
lowering glucose levels in non insulin dependent diabetes
mellitus and other disorders involving glucose intolerance,
enhancing mucosal regeneration and as
immunosuppressants [3,21-24].

There is a need to identify members of the DPPIV-like gene
family as this will allow the identification of
inhibitor(s) with specificity for particular family
member(s), which can then be administered for the purpose
of treatment of disease. Alternatively, the identified
member may of itself be useful for the treatment of
disease.

SUMMARY OF THE INVENTION 



The present invention seeks to address the above identified 
need and in a first aspect provides a peptide which 
comprises the amino acid sequence shown in SEQ ID NO: 2.

As described herein, the inventors believe that the peptide
is a prolyl oligopeptidase and a dipeptidyl peptidase,
because it has substantial and significant homology with
the amino acid sequences of DPPIV and DPP8. As homology is
observed between DPP8, DPPIV and DPP9, it will be
understood that DPP9 has a substrate specificity for at
least one of the following compounds: H-Ala-Pro-pNA,
H-Gly-Pro-pNA and H-Arg-Pro-pNA.

The peptide is homologous with human DPPIV and DPP8, and
importantly, identity between the sequences of DPPIV and
DPP8 and SEQ ID NO: 2 is observed at the regions of DPPIV
and DPP8 containing the catalytic triad residues and the
two glutamate residues of the β-propeller domain essential
for DPPIV enzyme activity. The observation of amino acid
sequence homology means that the peptide which has the
amino acid sequence shown in SEQ ID NO: 2 is a member of the
DPPIV-like gene family. Accordingly the peptide is now
named and described herein as DPP9.

The following sequences of the human DPPIV amino acid
sequence are important for the catalytic activity of DPPIV:
(i) Trp617GlyTrpSerTyrGlyGlyTyrVal; (ii)
Ala707AspAspAsnValHisPhe; (iii) Glu738AspHisGlyIleAlaSer; and
(iv) Trp201ValTyrGluGluGluVal [25-28]. As described herein,
the alignment of the following sequences of DPP9:
His833GlyTrpSerTyrGlyGlyPheLeu; Leu913AspGluAsnValHisPhePhe;
Glu944ArgHisSerIleArg and Phe350ValIleGlnGluGluPhe with
sequences (i) to (iv) above, respectively, suggests that
these sequences of DPP9 are likely to confer the catalytic
activity of DPP9. This is also supported by the alignment
of DPP9 and DPP8 amino acid sequences. More specifically,
DPP8 has substrate specificity for H-Ala-Pro-pNA,
H-Gly-Pro-pNA and H-Arg-Pro-pNA, and shares near identity,
with only one position of amino acid difference, in each of
the above described sequences of DPP9. Thus, in a second
aspect, the invention provides a peptide comprising the
following amino acid sequences:
HisGlyTrpSerTyrGlyGlyPheLeu; LeuAspGluAsnValHisPhePhe;
GluArgHisSerIleArg and PheValIleGlnGluGluPhe; which has the
substrate specificity of the sequence shown in SEQ ID NO: 2.
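
As an illustration only, the four sequences recited in the
second aspect can be converted to one-letter code and
searched for in a candidate amino acid sequence. The sketch
below is not part of the described work; the function names
are hypothetical, and the presence of the four sequences
does not by itself establish substrate specificity.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# The four DPP9 sequences recited in the second aspect.
DPP9_MOTIFS = [
    "HisGlyTrpSerTyrGlyGlyPheLeu",
    "LeuAspGluAsnValHisPhePhe",
    "GluArgHisSerIleArg",
    "PheValIleGlnGluGluPhe",
]


def to_one_letter(three_letter_seq):
    """Convert a run of three-letter codes (no separators) to one-letter code."""
    if len(three_letter_seq) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    return "".join(
        THREE_TO_ONE[three_letter_seq[i:i + 3]]
        for i in range(0, len(three_letter_seq), 3)
    )


def contains_all_motifs(candidate, motifs=DPP9_MOTIFS):
    """Return True if the one-letter candidate sequence contains every motif."""
    return all(to_one_letter(motif) in candidate for motif in motifs)
```

A candidate sequence would be supplied in one-letter code,
for example as read from a sequence file.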

Also described herein, using the GAP sequence alignment
algorithm, it is observed that DPP9 has 53% amino acid
similarity and 29% amino acid identity with a C. elegans
protein. Further, as shown herein, a nucleic acid molecule
which encodes DPP9 is capable of hybridising specifically
with DPP9 sequences derived from non-human species,
including rat and mouse. Further, the inventors have
isolated and characterised a mouse homologue of human DPP9.
Together these data demonstrate that DPP9 is expressed in
non-human species. Thus in a third aspect, the invention
provides a peptide which has at least 91% amino acid
identity with the amino acid sequence shown in SEQ ID NO: 2,
and which has the substrate specificity of the sequence
shown in SEQ ID NO: 2. Typically the peptide has the
sequence shown in SEQ ID NO: 4. Preferably, the amino acid
identity is 75%. More preferably, the amino acid identity
is 95%. Amino acid identity is calculated using GAP
software [GCG Version 8, Genetics Computer Group, Madison,
WI, USA] as described further herein. Typically, the
peptide comprises the following sequences:
HisGlyTrpSerTyrGlyGlyPheLeu; LeuAspGluAsnValHisPhePhe;
GluArgHisSerIleArg and PheValIleGlnGluGluPhe.
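
To make the notion of percentage amino acid identity
concrete, the following minimal sketch computes it from two
sequences that have already been aligned (gaps written as
"-"). It is an assumption of this sketch that positions
containing a gap are excluded from the count; the GAP
program may use a different convention, so values from this
function are illustrative and are not a substitute for the
GAP output referred to above.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped positions are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold_percent):
    """Return True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent
```

For example, the DPP9 sequence HGWSYGGFL and the DPPIV
sequence WGWSYGGYV (one-letter code, ungapped) share six of
nine positions.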

In view of the homology between DPPIV, DPP8 and DPP9 amino
acid sequences, it is expected that these sequences will
have similar tertiary structure. This means that the
tertiary structure of DPP9 is likely to include the
seven-blade β-propeller domain and the α/β hydrolase domain
of DPPIV. These structures in DPP9 are likely to be
conferred by the regions comprising β-propeller, Val226 to
Ala705, α/β hydrolase, Ser706 to Leu969, and about 70 to 90
residues in the region Ser136 to Gly225. As it is known that
the β-propeller domain regulates proteolysis mediated by the
catalytic triad in the α/β hydrolase domain of prolyl
oligopeptidase [29], it is expected that truncated forms of
DPP9 can be produced, which have the substrate specificity
of the sequence shown in SEQ ID NO: 2, comprising the
regions referred to above (His833GlyTrpSerTyrGlyGlyPheLeu;
Leu913AspGluAsnValHisPhePhe; Glu944ArgHisSerIleArg and
Phe350ValIleGlnGluGluPhe) which confer the catalytic
specificity of DPP9. Examples of truncated forms of DPP9
which might be prepared are those in which the region
conferring the β-propeller domain and the α/β hydrolase
domain are spliced together. Other examples of truncated
forms include those that are encoded by splice variants of
DPP9 mRNA. Thus although, as described herein, the
biochemical characterisation of DPP9 shows that DPP9
consists of 969 amino acids and has a molecular weight of
about 110 kDa, it is recognised that truncated forms of
DPP9 which have the substrate specificity of the sequence
shown in SEQ ID NO: 2 may be prepared using standard
techniques [30,31]. Thus in a fourth aspect, the invention
provides a fragment of the sequence shown in SEQ ID NO: 2,
which has the substrate specificity of the sequence shown
in SEQ ID NO: 2. The inventors believe that a fragment from
Ser136 to Leu969 (numbered according to SEQ ID NO: 2) would
have enzyme activity.

It is recognised that DPP9 may be fused, or in other words,
linked to a further amino acid sequence, to form a fusion
protein which has the substrate specificity of the sequence
shown in SEQ ID NO: 2. An example of a fusion protein is
one which comprises the sequence shown in SEQ ID NO: 2 which
is linked to a further amino acid sequence: a "tag"
sequence which consists of an amino acid sequence encoding
the V5 epitope and a His tag. An example of another
further amino acid sequence which may be linked with DPP9
is a glutathione S transferase (GST) domain [30]. Another
example of a further amino acid sequence is a portion of
CD8α [8]. Thus in one aspect, the invention provides a
fusion protein comprising the amino acid sequence shown in
SEQ ID NO: 2 linked with a further amino acid sequence, the
fusion protein having the substrate specificity of the
sequence shown in SEQ ID NO: 2.

It is also recognised that the peptide of the first aspect
of the invention may be comprised in a polypeptide, so that
the polypeptide has the substrate specificity of DPP9. The
polypeptide may be useful, for example, for altering the
protease susceptibility of DPP9, when used in in vivo
applications. An example of a polypeptide which may be
useful in this regard is albumin. Thus in another
embodiment, the peptide of the first aspect is comprised in
a polypeptide which has the substrate specificity of DPP9.

In one aspect, the invention provides a peptide which
includes the amino acid sequence shown in SEQ ID NO: 7. In
one embodiment the peptide consists of the amino acid
sequence shown in SEQ ID NO: 7.

As described further herein, the amino acid sequence shown
in SEQ ID NO: 7, and the amino acid sequences of DPPIV, DPP8
and FAP are homologous. DPPIV, DPP8 and FAP have
dipeptidyl peptidase enzymatic activity and have substrate
specificity for peptides which contain the di-peptide
sequence, Ala-Pro. The inventors note that the amino acid
sequence shown in SEQ ID NO: 7 contains the catalytic triad,
Ser-Asp-His. Accordingly, it is anticipated that the amino
acid sequence shown in SEQ ID NO: 7 has enzymatic activity
in being capable of cleaving a peptide which contains
Ala-Pro by hydrolysis of a peptide bond located C-terminal
adjacent to proline in the di-peptide sequence.

In one embodiment, the peptide comprises an amino acid
sequence shown in SEQ ID NO: 7 which is capable of cleaving
a peptide bond which is C-terminal adjacent to proline in
the sequence Ala-Pro. The capacity of a dipeptidyl
peptidase to cleave a peptide bond which is C-terminal
adjacent to proline in the di-peptide sequence Ala-Pro can
be determined by standard techniques, for example, by
observing hydrolysis of a peptide bond which is C-terminal
adjacent to proline in the molecule Ala-Pro-p-nitroanilide.

The inventors recognise that by using standard techniques
it is possible to generate a peptide which is a truncated
form of the sequence shown in SEQ ID NO: 7, which retains
the proposed enzymatic activity described above. An
example of a truncated form of the amino acid sequence
shown in SEQ ID NO: 7 which retains the proposed enzymatic
activity is a form which includes the catalytic triad,
Ser-Asp-His. Thus a truncated form may consist of less than
the 831 amino acids shown in SEQ ID NO: 7. Accordingly, in
a further embodiment, the peptide is a truncated form of
the peptide shown in SEQ ID NO: 7, which is capable of
cleaving a peptide bond which is C-terminal adjacent to
proline in the sequence Ala-Pro.

It will be understood that the amino acid sequence shown in
SEQ ID NO: 7 may be altered by one or more amino acid
deletions, substitutions or insertions of that amino acid
sequence and yet retain the proposed enzymatic activity
described above. It is expected that a peptide which is at
least 47% similar to the amino acid sequence of SEQ ID
NO: 7, or which is at least 27% identical to the amino acid
sequence of SEQ ID NO: 7, will retain the proposed enzymatic
activity described above. The % similarity can be
determined by use of the program/algorithm "GAP" which is
available from Genetics Computer Group (GCG), Wisconsin.
Thus in another embodiment of the first aspect, the peptide
has an amino acid sequence which is at least 47% similar to
the amino acid sequence shown in SEQ ID NO: 7, and is
capable of cleaving a peptide bond which is C-terminal
adjacent to proline in the sequence Ala-Pro.

As described above, the isolation and characterisation of
DPP9 is necessary for identifying inhibitors of DPP9
catalytic activity, which may be useful for the treatment
of disease. Accordingly, in a fifth aspect, the invention
provides a method of identifying a molecule capable of
inhibiting cleavage of a substrate by DPP9, the method
comprising the following steps:

(a) contacting DPP9 with the molecule;

(b) contacting DPP9 of step (a) with a substrate
capable of being cleaved by DPP9, in conditions sufficient
for cleavage of the substrate by DPP9; and

(c) detecting substrate not cleaved by DPP9, to
identify that the molecule is capable of inhibiting
cleavage of the substrate by DPP9.
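
Steps (a) to (c) can be sketched as a simple calculation. The
example below assumes, purely for illustration, that
cleavage is followed as a rate of signal change (for
example, release of p-nitroaniline from H-Ala-Pro-pNA) with
and without the candidate molecule. The function names, the
optional blank correction and the 50% cut-off are
assumptions of this sketch and are not taken from the
method described herein.

```python
def percent_inhibition(rate_with_molecule, rate_without_molecule, blank_rate=0.0):
    """Percent reduction in substrate cleavage caused by the candidate molecule.

    All rates are in the same arbitrary units (e.g. absorbance change per minute).
    blank_rate is the signal observed with no enzyme present.
    """
    uninhibited = rate_without_molecule - blank_rate
    if uninhibited <= 0:
        raise ValueError("uninhibited rate must exceed the blank rate")
    inhibited = rate_with_molecule - blank_rate
    return 100.0 * (1.0 - inhibited / uninhibited)


def is_inhibitor(rate_with_molecule, rate_without_molecule,
                 blank_rate=0.0, threshold_percent=50.0):
    """Flag the molecule as an inhibitor when inhibition reaches the threshold.

    The default threshold is an arbitrary illustrative value.
    """
    inhibition = percent_inhibition(
        rate_with_molecule, rate_without_molecule, blank_rate
    )
    return inhibition >= threshold_percent
```

In practice the threshold would be chosen with reference to
assay variability and appropriate controls.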

It is recognised that although inhibitors of DPP9 may also
inhibit DPPIV and other serine proteases, as described
herein, the alignment of the DPP9 amino acid sequence with
most closely related molecules (i.e. DPPIV) reveals that
the DPP9 amino acid sequence is distinctive, particularly
at the regions controlling substrate specificity.
Accordingly, it is expected that it will be possible to
identify inhibitors which inhibit DPP9 catalytic activity
specifically, and which do not inhibit catalytic activity
of other DPPIV-like gene family members, or other serine
proteases. Thus, in a sixth aspect, the invention provides
a method of identifying a molecule capable of inhibiting
specifically the cleavage of a substrate by DPP9, the
method comprising the following steps:

(a) contacting DPP9 and a further protease with the
molecule;

(b) contacting DPP9 and the further protease of step
(a) with a substrate capable of being cleaved by DPP9 and
the further protease, in conditions sufficient for cleavage
of the substrate by DPP9 and the further protease; and

(c) detecting substrate not cleaved by DPP9, but
cleaved by the further protease, to identify that the
molecule is capable of inhibiting specifically the
cleavage of the substrate by DPP9.
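
The decision made in step (c) of the sixth aspect can be
sketched as follows. The sketch assumes that a percentage
inhibition has already been measured for DPP9 and for each
further protease. The function name and both cut-off values
are hypothetical and serve only to show the logic of
"inhibits DPP9 but not the further protease".

```python
def is_specific_inhibitor(inhibition_by_protease, target="DPP9",
                          min_target_percent=50.0, max_other_percent=10.0):
    """Return True if the molecule inhibits the target but not the other proteases.

    inhibition_by_protease maps a protease name to percent inhibition (0 to 100).
    Both cut-offs are arbitrary illustrative values.
    """
    if target not in inhibition_by_protease:
        raise KeyError("no measurement for " + target)
    others = [
        value for name, value in inhibition_by_protease.items()
        if name != target
    ]
    if not others:
        raise ValueError("at least one further protease is required")
    target_inhibited = inhibition_by_protease[target] >= min_target_percent
    others_spared = all(value <= max_other_percent for value in others)
    return target_inhibited and others_spared
```

A molecule that strongly inhibits both DPP9 and DPPIV would
be reported as not specific under this rule.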

In a seventh aspect, the invention provides a method of
reducing or inhibiting the catalytic activity of DPP9, the
method comprising the step of contacting DPP9 with an
inhibitor of DPP9 catalytic activity. In view of the
homology between DPP9 and DPP8 amino acid sequences, it
will be understood that inhibitors of DPP8 activity may be
useful for inhibiting DPP9 catalytic activity. Examples of
inhibitors suitable for use in the seventh aspect are
described in [21,32,33]. Other inhibitors useful for
inhibiting DPP9 catalytic activity can be identified by the
methods of the fifth or sixth aspects of the invention.

In one embodiment, the catalytic activity of DPP9 is
reduced or inhibited in a mammal by administering the
inhibitor of DPP9 catalytic activity to the mammal. It is
recognised that these inhibitors have been used to reduce
or inhibit DPPIV catalytic activity in vivo, and therefore,
may also be used for inhibiting DPP9 catalytic activity in
vivo. Examples of inhibitors useful for this purpose are
disclosed in the following [21,32-34].

Preferably, the catalytic activity of DPP9 is reduced or
inhibited in the mammal for the purpose of treating a
disease in the mammal. Diseases which are likely to be
treated by an inhibitor of DPP9 catalytic activity are
those with which DPPIV-like gene family members are
associated [3,10,11,17,21,36], including for example,
neoplasia, type II diabetes, cirrhosis, autoimmunity, graft
rejection and HIV infection.

Preferably, the inhibitor for use in the seventh aspect of
the invention is one which inhibits the cleavage of a
peptide bond C-terminal adjacent to proline. As described
herein, examples of these inhibitors are
4-(2-aminoethyl)benzenesulfonylfluoride, aprotinin,
benzamidine/HCl, Ala-Pro-Gly, H-Lys-Pro-OH HCl salt and
zinc ions, for example, zinc sulfate or zinc chloride. More
preferably, the inhibitor is one which specifically
inhibits DPP9 catalytic activity, and which does not
inhibit the catalytic activity of other serine proteases,
including, for example, DPPIV, DPP8 or FAP.

In an eighth aspect, the invention provides a method of
cleaving a substrate which comprises contacting the
substrate with DPP9 in conditions sufficient for cleavage
of the substrate by DPP9, to cleave the substrate.
Examples of molecules which can be cleaved by the method
are H-Ala-Pro-pNA, H-Gly-Pro-pNA and H-Arg-Pro-pNA.
Molecules which are cleaved by DPPIV, including RANTES,
eotaxin, macrophage-derived chemokine, stromal-cell-derived
factor 1, glucagon and glucagon-like peptides 1 and 2,
neuropeptide Y, substance P and vasoactive peptide, are also
likely to be cleaved by DPP9 [11,12]. In one embodiment,
the substrate is cleaved by cleaving a peptide bond
C-terminal adjacent to proline in the substrate. The
molecules cleaved by DPP9 may have Ala, or Trp, Ser, Gly,
Val or Leu in the P1 position, in place of Pro [11,12].

The inventors have characterised the sequence of a nucleic
acid molecule which encodes the amino acid sequence shown
in SEQ ID NO: 2. Thus in a tenth aspect, the invention
provides a nucleic acid molecule which encodes the amino
acid sequence shown in SEQ ID NO: 2.



In an eleventh aspect, the invention provides a nucleic
acid molecule which consists of the sequence shown in SEQ
ID NO: 1.

In another aspect, the invention provides a nucleic acid
molecule which encodes a peptide comprising the amino acid
sequence shown in SEQ ID NO: 7.

The inventors have characterised the nucleotide sequence of
the nucleic acid molecule encoding SEQ ID NO: 7. The
nucleotide sequence of the nucleic acid molecule encoding
DPP4-like-2 is shown in SEQ ID NO: 8. Thus, in one
embodiment, the nucleic acid molecule comprises the
nucleotide sequence shown in SEQ ID NO: 8. In another
embodiment, the nucleic acid molecule consists of the
nucleotide sequence shown in SEQ ID NO: 8.

The inventors recognise that a nucleic acid molecule which
has the nucleotide sequence shown in SEQ ID NO: 8 could be
made by producing only the fragment of the nucleotide
sequence which is translated. Thus in an embodiment, the
nucleic acid molecule does not contain 5' or 3'
untranslated nucleotide sequences.

As described herein, the inventors observed RNA of 4.4 kb
and a minor band of 4.8 kb in length which hybridised to
a nucleic acid molecule comprising the sequence shown in
SEQ ID NO: 8. It is possible that these mRNA species are
splice variants. Thus in another embodiment, the nucleic
acid molecule comprises the nucleotide sequence shown in
SEQ ID NO: 8 and is approximately 4.4 kb or 4.8 kb in
length.

In another embodiment, the nucleic acid molecule is
selected from the group of nucleic acid molecules
consisting of DPP4-like-2a, DPP4-like-2b and DPP4-like-2c,
as shown in Figure 2.

In another aspect, the invention provides a nucleic acid
molecule having a sequence shown in SEQ ID NO: 3.

In a twelfth aspect, the invention provides a nucleic acid
molecule which is capable of hybridising to a nucleic acid
molecule consisting of the sequence shown in SEQ ID NO: 1 in
stringent conditions, and which encodes a peptide which has
the substrate specificity of the sequence shown in SEQ ID
NO: 2. As shown in the Northern blot analysis described
herein, DPP9 mRNA hybridises specifically to the sequence
shown in SEQ ID NO: 1, after washing in 2XSSC/1.0% SDS at
37°C, or after washing in 0.1XSSC/0.1% SDS at 50°C.
"Stringent conditions" are conditions in which the nucleic
acid molecule is exposed to 2XSSC/1.0% SDS. Preferably,
the nucleic acid molecule is capable of hybridising to a
molecule consisting of the sequence shown in SEQ ID NO: 1 in
high stringent conditions. "High stringent conditions" are
conditions in which the nucleic acid molecule is exposed to
0.1XSSC/0.1% SDS at 50°C.
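
For orientation, the salt content of the two wash solutions
can be worked out from the SSC dilution. The sketch below
assumes the commonly used composition of 1X SSC (0.15 M
sodium chloride, 0.015 M sodium citrate); that composition,
the dictionary layout and the function names are
assumptions of this example and are not stated herein. No
temperature is recorded for "stringent" because the
definition above does not give one.

```python
# Assumed composition of 1X SSC (commonly used values).
NACL_M_PER_1X = 0.15
CITRATE_M_PER_1X = 0.015

# Wash conditions as defined in the text above.
WASHES = {
    "stringent": {"ssc_fold": 2.0, "sds_percent": 1.0, "temp_c": None},
    "high stringent": {"ssc_fold": 0.1, "sds_percent": 0.1, "temp_c": 50},
}


def ssc_composition(ssc_fold):
    """Molar NaCl and sodium citrate for a given SSC dilution (e.g. 2.0 for 2X)."""
    return {
        "NaCl_M": ssc_fold * NACL_M_PER_1X,
        "sodium_citrate_M": ssc_fold * CITRATE_M_PER_1X,
    }


def nacl_molar(wash_name):
    """Molar NaCl concentration of a named wash condition."""
    return ssc_composition(WASHES[wash_name]["ssc_fold"])["NaCl_M"]
```

Under these assumptions the high stringent wash contains
one twentieth of the salt of the stringent wash, which is
consistent with lower salt giving a more stringent wash.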

As described herein, the inventors believe that the gene
which encodes DPP9 is located at band p13.3 on human
chromosome 19. The location of the DPP9 gene is
distinguished from genes encoding other prolyl
oligopeptidases, which are located on chromosome 2, at
bands 2q24.3 and 2q23, chromosome 7 or chromosome 15q22.
Thus in an embodiment, the nucleic acid molecule is one
capable of hybridising to a gene which is located at band
p13.3 on human chromosome 19.

It is recognised that a nucleic acid molecule which encodes
the amino acid sequence shown in SEQ ID NO: 2, or which
comprises the sequence shown in SEQ ID NO: 1, could be made
by producing the fragment of the sequence which is
translated, using standard techniques [30,31]. Thus in an
embodiment, the nucleic acid molecule does not contain 5'
or 3' untranslated sequences.

In a thirteenth aspect, the invention provides a vector
which comprises a nucleic acid molecule of the tenth aspect
of the invention. In one embodiment, the vector is capable
of replication in a COS-7 cell, CHO cell or 293T cell, or
E. coli. In another embodiment, the vector is selected from
the group consisting of λTripleEx, pTripleEx, pGEM-T Easy
Vector, pSecTag2Hygro, pET15b, pEE14.HCMV.gs and
pCDNA3.1/V5/His.

In a fourteenth aspect, the invention provides a cell which
comprises a vector of the thirteenth aspect of the
invention. In one embodiment, the cell is an E. coli cell.
Preferably, the E. coli is MC1061, DH5α, JM109, BL21DE3,
pLysS. In another embodiment, the cell is a COS-7, COS-1,
293T or CHO cell.

In a fifteenth aspect, the invention provides a method for
making a peptide of the first aspect of the invention
comprising maintaining a cell according to the fourteenth
aspect of the invention in conditions sufficient for
expression of the peptide by the cell. The conditions
sufficient for expression are described herein. In one
embodiment, the method comprises the further step of
isolating the peptide.

In a sixteenth aspect, the invention provides a peptide 
when produced by the method of the fifteenth aspect. 

In a seventeenth aspect, the invention provides a
composition comprising a peptide of the first aspect and a
pharmaceutically acceptable carrier.

In an eighteenth aspect, the invention provides an antibody
which is capable of binding a peptide according to the
first aspect of the invention. The antibody can be
prepared by immunising a subject with purified DPP9 or a
fragment thereof according to standard techniques [35]. An
antibody may be prepared by immunising with transiently
transfected DPP9+ cells. It is recognised that the
antibody is useful for inhibiting activity of DPP9. In one
embodiment, the antibody of the eighteenth aspect of the
invention is produced by a hybridoma cell.

In a nineteenth aspect, the invention provides a hybridoma
cell which secretes an antibody of the eighteenth aspect.



Figure 1. Nucleotide sequence of DPP8 (SEQ ID NO: 5).
Figure 2. Schematic representation of the cloning of human
cDNA DPP9.
Figure 3. Schematic representation of the assembly of
nucleotide sequences of human cDNA DPP9.
Figure 4. Nucleotide sequence of human cDNA DPP9 (SEQ ID
NO: 1) and amino acid sequence of human DPP9 (SEQ ID NO: 2).
Figure 5. Alignment of human DPP9 amino acid sequences
with the amino acid sequence encoded by a predicted open
reading frame of GDD.
Figure 6. Alignment of human DPP8, DPP9, DPP4 and FAP
amino acid sequences.
Figure 7. Northern blot analysis of human DPP9 RNA.
Figure 8. Alignment of murine (SEQ ID NO: 4) and human DPP9
amino acid sequences.
Figure 9. Alignment of murine (SEQ ID NO: 3) and human DPP9
cDNA nucleotide sequences.
Figure 10. Northern blot analysis of rat DPP9 RNA.
Figure 11. Detection of DPP9 cDNA in CEM cells.
Figure 12. Detection of murine DPP9 nucleotide sequence.

DETAILED DESCRIPTION OF THE INVENTION

EXAMPLES 
General 

Restriction enzymes and other enzymes used in cloning were
obtained from Boehringer Mannheim Roche. Standard molecular
biology techniques were used unless indicated otherwise.

DPP9 Cloning

The nucleotide sequence of DPP8 shown in Figure 1 was used
to search the GenBank database for homologous nucleotide
sequences. Nucleotide sequences referenced by GenBank
accession numbers AC005594 and AC005783 were detected and
named GDD. The GDD nucleotide sequence is 39.5 kb and has
19 predicted exons. The analysis of the predicted
exon-intron boundaries in GDD suggests that the predicted
open reading frame of GDD is 3.6 kb in length.

In view of the homology of DPP8 and the GDD nucleotide
sequences, we hypothesised the existence of DPPIV-like
molecules other than DPP8. We used oligonucleotide primers
derived from the nucleotide sequence of GDD and reverse
transcription PCR (RT-PCR) to isolate a cDNA encoding
DPPIV-like molecules.

RT-PCR amplification of human liver RNA derived from a pool
of 4 patients with autoimmune hepatitis using the primers
GDD pr 1F and GDD pr 1R (Table 1) produced a 500 base pair
product. This suggested that DPPIV-like molecules are
likely to be expressed in liver cells derived from
individuals with autoimmune hepatitis and that RNA derived
from these cells is likely to be a suitable source for
isolating cDNA clones encoding DPPIV-like molecules.

Primers GDD pr 3F and GDD pr 1R (Table 1) were then used to
isolate a cDNA clone encoding a DPP4-like molecule. A 1.6
kb fragment was observed and named DPP4-like-2a. Primers
GDD pr 15F and GDD pr 7R (Table 1) were then used to isolate
a cDNA clone encoding a DPP4-like molecule. A 1.9 kb product
was observed and named DPP4-like-2b. As described further
herein, the sequence of DPP4-like-2b overlaps with the
sequence of DPP4-like-2a.

The DPP4-like-2a and 2b fragments were gel purified using
WIZARD® PCR preps kit and cloned into the pGEM®-T-easy
plasmid vector using the EcoRI restriction sites. The
ligation reaction was used to transform JM109 competent
cells. The plasmid DNA was prepared by miniprep. The
inserts were released by EcoRI restriction digestion. The
DNA was sequenced in both directions using the M13Forward
and M13Reverse sequencing primers. The complete sequence
of DPP4-like-2a and 2b fragments was derived by primer
walking.

The nucleotide sequence 5' adjacent to DPP4-like-2b was
obtained by 5' RACE using dC tailing and the gene specific
primers GDD GSP1.1 and 2.1 (Table 1). A fragment of 500
base pairs (DPP4-like-2c) was observed. The fragment was
gel purified using WIZARD® PCR preps kit and cloned into
the pGEM®-T-easy plasmid vector using the EcoRI restriction
sites. The ligation reaction was used to transform JM109
competent cells. The plasmid DNA was prepared by miniprep.
The inserts were released by EcoRI restriction digestion.
The DNA was sequenced in both directions using the
M13Forward and M13Reverse sequencing primers.

We identified further sequences, BE727051 and BE244612,
with identity to the 5' end of DPP9. These were discovered
while performing BLASTn with the 5' end of the DPP9
nucleotide sequence. BE727051 contained further 5' sequence
for DPP9, which was also present in the genomic sequence
for DPP9 on chromosome 19p13.3. This was used to design
primer DPP9-22F (5'GCCGGCGGGTCCCCTGTGTCCG3'). Primer 22F
was used in conjunction with primer GDD3'end
(5'GGGCGGGACAAAGTGCCTCACTGG3') on cDNA made from the human
CEM cell line to produce a 3000 bp product as expected
(Figure 11).



Nucleotide sequence analysis of DPP4-like-2a, 2b, and 2c
fragments

An analysis of the nucleotide sequence of fragments
DPP4-like-2a, 2b and 2c with the Sequencher™ version 3.0
computer program (Figure 3), and the 5' fragment isolated
by primers DPP9-22F and GDD3'end, revealed the nucleotide
sequence shown in Figure 4.

The predicted amino acid sequence shown in Figure 4 was
compared to a predicted amino acid sequence encoded by a
predicted open reading frame of GDD (predicted from the
nucleotide sequence referenced by GenBank Accession Nos.
AC005594 and AC005783), to determine the relatedness of the
nucleotide sequence of Figure 4 to the nucleotide sequence
of the predicted open reading frame of GDD (Figure 5).
Regions of amino acid identity were observed, suggesting
that there may be regions of nucleotide sequence identity
of the predicted open reading frame of GDD and the sequence
of Figure 4. However, as noted in Figure 5, there are
regions of amino acid sequence encoded by the sequence of
Figure 4 and the amino acid sequence encoded by the
predicted open reading frame of GDD which are not
identical, demonstrating that the nucleotide sequences
encoding the predicted open reading frame of GDD and the
sequence shown in Figure 4 are different nucleotide
sequences.

As described further herein, the predicted amino acid
sequence encoded by the cDNA sequence shown in Figure 4 is
homologous to the amino acid sequence of DPP8 (Figure 6).
Accordingly, and as a cDNA consisting of the nucleotide
sequence shown in Figure 4 was not known, the sequence
shown in Figure 4 was named cDNA DPP9.

The predicted amino acid sequence encoded by cDNA DPP9
(called DPP9) is 969 amino acids and is shown in Figure 4.
The alignment of DPP9 and DPP8 amino acid sequences
suggests that the nucleotide sequence shown in Figure 4 may
be a partial length clone. Notwithstanding this point, as
discussed below, the inventors have found that the
alignment of the DPP9 amino acid sequence with the amino
acid sequences of DPP8, DPP4 and FAP shows that DPP9
comprises sequence necessary for providing enzymolysis and
utility. In view of the similarity between DPP9 and DPP8, a
full length clone may be of the order of 882 amino acids. A
full length clone could be obtained by standard techniques,
including, for example, the RACE technique using an
oligonucleotide primer derived from the 5' end of cDNA
DPP9.
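
As an illustration of how a predicted amino acid sequence is read from a cDNA, the following minimal sketch translates the first 60 nucleotides shown in Figure 4 using the standard genetic code. The function name `translate`, the constant names and the compact codon-table construction are illustrative assumptions and are not part of the disclosure; the reading frame is assumed to start at nucleotide 1, as drawn in Figure 4.

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate a DNA string codon by codon, starting at offset `frame`."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


# Nucleotides 1-60 as printed in Figure 4.
FIG4_NT_1_60 = "CGGCGGGTCCCCTGTGTCCGCCGCGGCTGTCGTCCCCCGCTCCCGCCACTTCCGGGGTCG"

print(translate(FIG4_NT_1_60))  # RRVPCVRRGCRPPLPPLPGS
```

The printed peptide corresponds to residues 1 to 20 of the predicted amino acid sequence in Figure 4.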

In view of the homology between the DPP8 and DPP9 amino
acid sequences, it is likely that cDNA DPP9 encodes an
amino acid sequence which has dipeptidyl peptidase
enzymatic activity. Specifically, it is noted that the
DPP9 amino acid sequence contains the catalytic triad
Ser-Asp-His in the order of a non-classical serine
protease, as required for the charge relay system. The
serine recognition site characteristic of DPP4 and
DPP4-like family members, GWSYGG, surrounds the serine
residue, also suggesting that DPP9 cDNA will encode a
DPP4-like enzyme activity.

Further, the DPP9 amino acid sequence also contains the two
glutamic acid residues located at positions 205 and 206 in
DPPIV. These are believed to be essential for the
dipeptidyl peptidase enzymatic activity. By sequence
alignment with DPPIV, the residues in DPP8 predicted to
play a pivotal role in the pore opening mechanism in Blade
2 of the propeller are E259 and E260. These are equivalent
to the residues Glu205 and Glu206 in DPPIV, which
previously have been shown to be essential for DPPIV enzyme
activity. A point mutation, Glu259Lys, was made in DPP8
cDNA using the QuikChange Site-Directed Mutagenesis Kit
(Stratagene, La Jolla). COS-7 cells transfected with
wildtype DPP8 cDNA stained positive for H-Ala-Pro4MbNA
enzyme activity, while the mutant cDNA gave no staining.
Expression of DPP8 protein was demonstrated in COS cells
transfected with wildtype and mutant cDNAs by
immunostaining with anti-V5 mAb. This mAb detects the V5
epitope that has been tagged to the C-terminus of DPP8
protein. Point mutations were made to each of the catalytic
residues of DPP8, Ser739Ala, Asp817Ala and His849Ala, and
each of these residues was also determined to be essential
for DPP8 enzyme activity. In summary, the residues that
have been shown experimentally to be required for enzyme
activity in DPPIV and DPP8 are present in the DPP9 amino
acid sequence: Glu354, Glu355, Ser836, Asp914 and His946.
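
The residue positions named above can be checked against the sequence of Figure 4. The sketch below is a minimal illustration: it copies four 20-residue segments from Figure 4 (keyed by the number of their first residue) and confirms which amino acid sits at each named position, together with the location of the GWSYGG motif as it appears in Figure 4. The names `FIG4_SEGMENTS` and `residue_at` are hypothetical helpers introduced for this example only.

```python
# Segments of the predicted DPP9 amino acid sequence copied from Figure 4,
# keyed by the 1-based position of their first residue.
FIG4_SEGMENTS = {
    341: "DPKSAGVATFVIQEEFDRFT",
    821: "KYGFIDLSRVAIHGWSYGGF",
    901: "NEPNRLLILHGFLDENVHFF",
    941: "YPNERHSIRCPESGEHYEVT",
}


def residue_at(position):
    """Return the one-letter residue at a 1-based position, if covered."""
    for start, segment in FIG4_SEGMENTS.items():
        if start <= position < start + len(segment):
            return segment[position - start]
    raise KeyError(f"position {position} is not in the copied segments")


# Residues named in the text: two Glu, then the Ser-Asp-His triad.
NAMED_RESIDUES = {354: "E", 355: "E", 836: "S", 914: "D", 946: "H"}
for position, expected in NAMED_RESIDUES.items():
    print(position, residue_at(position), residue_at(position) == expected)

# 1-based start of the serine recognition motif within Figure 4.
motif_start = 821 + FIG4_SEGMENTS[821].find("GWSYGG")
print("GWSYGG starts at residue", motif_start)
```

Under these assumptions the motif spans residues 834 to 839, so that Ser836 is its third residue.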

The DPP9 amino acid sequence shows the closest relatedness
to DPP8, having 77% amino acid similarity and 60% amino
acid identity. The relatedness to DPPIV is 25% amino acid
identity and 47% amino acid similarity. The % similarity
was determined by use of the program/algorithm "GAP", which
is available from Genetics Computer Group (GCG), Wisconsin.

DPP9 mRNA Expression Studies
DPP4-like-2a was used to probe a Human Master RNA Blot™
(CLONTECH Laboratories Inc., USA) to study DPP9 tissue
expression and the relative levels of DPP9 mRNA expression.

The DPP4-like-2a fragment hybridised to all tissue mRNA
samples on the blot. The hybridisation also indicated high
levels of DPP9 expression in most of the tissue samples on
the blot (data not shown).



The DPP4-like-2a fragment was then used to probe two
Multiple Tissue Northern Blots™ (CLONTECH Laboratories
Inc., USA) to examine the mRNA expression and to determine
the size of the DPP9 mRNA transcript.

The autoradiographs of the DPP9 Multiple Tissue Northern
blot are shown in Figure 8. The DPP9 transcript was seen in
all tissues examined, confirming the results obtained from
the Master RNA blot. A single major transcript 4.4 kb in
size was seen in all tissues represented on the two blots
after 16 hours of exposure. Weak bands could also be seen in
some tissues after 6 hours of exposure. The DPP9 transcript
was smaller than the 5.1 kb mRNA transcript of DPP8. A
minor, very weak transcript 4.8 kb in size was also seen in
the spleen, pancreas, peripheral blood leukocytes and heart.
The highest mRNA expression was observed in the spleen and
heart. Of all tissues examined, the thymus had the least
DPP9 mRNA expression. The Multiple Tissue Northern Blots
were also probed with a β-actin positive control. A 2.0 kb
band was seen in all tissues. In addition, as expected, a
1.8 kb β-actin band was seen in heart and skeletal muscle.

Rat DPP9 expression

A Rat Multiple Tissue Northern Blot (CLONTECH Laboratories,
Inc., USA; catalogue #: 7764-1) was hybridised with a human
DPP9 radioactively labeled probe, made using the Megaprime
DNA Labeling kit and [32P] dCTP (Amersham International plc,
Amersham, UK). The DPP9 PCR product used to make the probe
was generated using Met3F (GGCTGAGAGGATGGCCACCACCGGG) as
the forward primer and GDD 3' end
(GGGCGGGACAAAGTGCCTCACTGG) as the reverse primer. The
hybridisation was carried out according to the
manufacturer's instructions at 60°C to detect cross-species
hybridisation. After overnight hybridisation the blot was
washed at room temperature (2x SSC, 0.1% SDS), then at 40°C
(0.1x SSC, 0.1% SDS).
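
The relationship between the two probe primers and the cDNA of Figure 4 can be illustrated with a short check. The sketch below is an assumption-laden example rather than part of the described method: it copies nucleotides 301-360 and the final 3' segment from Figure 4, then confirms that Met3F matches the sense strand and that the reverse complement of the GDD 3' end primer matches the 3' end of the cDNA. The helper `reverse_complement` and the constant names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Segments copied from Figure 4 (sense strand).
FIG4_NT_301_360 = "TCCGAGGGGGCTGAGAGGATGGCCACCACCGGGACCCCAACGGCCGACCGAGGCGACGCA"
FIG4_3PRIME_END = "CCAGTGAGGCACTTTGTCCCGCCC"

# Primers as given in the text.
MET3F = "GGCTGAGAGGATGGCCACCACCGGG"
GDD_3_END = "GGGCGGGACAAAGTGCCTCACTGG"

# Forward primer anneals to the antisense strand, so it reads as sense.
forward_offset = FIG4_NT_301_360.find(MET3F)
print("Met3F found at nucleotide", 301 + forward_offset)

# Reverse primer is the reverse complement of the sense-strand 3' end.
print(reverse_complement(GDD_3_END) == FIG4_3PRIME_END)
```

A check of this kind is a quick way to confirm that primer sequences transcribed from a document are consistent with the template sequence they are meant to amplify.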

The human cDNA probe identified two bands in all tissues
examined except in testes. A major transcript of 4 kb in
size was seen in all tissues except testes. This 4 kb
transcript was strongly expressed in the liver, heart and
brain. A second, weaker transcript 5.5 kb in size was
present in all tissues except skeletal muscle and testes.
However, in the brain the 5.5 kb transcript was expressed at
a higher level than the 4.4 kb transcript. In the testes
only one transcript, approximately 3.5 kb in size, was
detected. Thus, rat DPP9 mRNA hybridised with a human DPP9
probe, indicating significant homology between DPP9 of the
two species. The larger 5.5 kb transcript observed may be
due to cross-hybridisation to rat DPP8.

Mouse DPP9 expression

A UniGene cluster for mouse DPP9 was identified (UniGene
Cluster Mm.33185) by homology to human DPP9. An analysis of
expressed sequence tags contained in this cluster and mouse
genomic sequence (AC026385) for Chromosome 17 with the
Sequencher™ version 3.0 computer program revealed the
nucleotide sequence shown in Figure 9. This 3517 bp cDNA
encodes an 869 aa mouse DPP9 protein (missing N-terminus)
with 91% amino acid identity and 94% amino acid similarity
to human DPP9. The mouse DPP9 amino acid sequence also has
the residues required for enzyme activity, Ser, Asp and His
and the two Glu residues.
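
Percent identity figures of this kind are derived from a pairwise alignment. As a minimal sketch, assuming identity is counted over aligned columns in which neither sequence has a gap (the GAP program's exact scoring and similarity calculation are not reproduced here), the following computes percent identity for one 50-residue block of the human and mouse DPP9 alignment (human residues 151-200, mouse residues 51-100). The function name `percent_identity` is illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap_chars=".-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a in gap_chars or b in gap_chars:
            continue  # gap columns are excluded from the denominator
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# One block of the human/mouse DPP9 alignment, copied from the figures.
HUMAN_151_200 = "SGLIVNKAPHDFQFVQKTDESGPHSHRLYYLGMPYGSRENSLLYSEIPKK"
MOUSE_51_100 = "SGLIVSKAPHDFQFVQKPDESGPHSHRLYYLGMPYGSRENSLLYSEIPKK"

print(percent_identity(HUMAN_151_200, MOUSE_51_100))  # 96.0
```

This single block differs at two of 50 positions; the figure quoted in the text is the value over the whole alignment, which includes less conserved regions.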

The primers mgdd-pr1F (5' ACCTGGGAGGAAGCACCCCACTGTG 3') and
mgdd-pr4R (5' TTCCACCTGGTCCTCAATCTCC 3') were designed from
this sequence and used to amplify a 452 bp product, as
expected, from mouse liver cDNA, as described below.



RNA preparation
C57Bl6 mice underwent carbon tetrachloride treatment to
induce liver fibrosis. Liver RNA was prepared from
snap-frozen tissues using the TRIzol® Reagent and other
standard methods.

cDNA synthesis
2 µg of liver RNA was reverse-transcribed using SuperScript
II RNase H- Reverse Transcriptase (Gibco BRL).

PCR
PCR using mDPP9-1F (ACCTGGGAGGAAGCACCCCACTGTG) as the
forward primer and mDPP9-2R (CTCTCCACATGCAGGGCTACAGAC) as
the reverse primer was used to synthesise a 550 base pair
mouse DPP9 fragment. The PCR products were generated using
AmpliTaq Gold® DNA Polymerase. The PCR was performed as
follows: denaturation at 95°C for 10 min, followed by 35
cycles of denaturation at 95°C for 30 seconds, primer
annealing at 60°C for 30 seconds, and extension at 72°C
for 1 min.
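
For planning purposes, the cycling programme above can be written down as data and its total hold time computed. The sketch below is illustrative only: the `Step` class and function name are assumptions, and ramp times between temperatures, which depend on the thermal cycler, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# Programme as described in the text.
INITIAL = Step("initial denaturation", 95, 10 * 60)
CYCLE = (
    Step("denaturation", 95, 30),
    Step("primer annealing", 60, 30),
    Step("extension", 72, 60),
)
N_CYCLES = 35


def total_hold_seconds(initial, cycle, n_cycles):
    """Sum of programmed hold times, excluding temperature ramps."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


total = total_hold_seconds(INITIAL, CYCLE, N_CYCLES)
print(f"{total} s = {total / 60:.0f} min")  # 4800 s = 80 min
```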

Southern Blot
DPP9 PCR products from six mice, as well as the largest
human DPP9 PCR product, were run on a 1% agarose gel. The
DNA on the gel was then denatured using 0.4 M NaOH and
transferred onto a Hybond-N+ membrane (Amersham
International plc, Amersham, UK). The largest human DPP9
PCR product was radiolabeled using the Megaprime DNA
Labeling kit and [32P] dCTP (Amersham International plc,
Amersham, UK). Unincorporated label was removed using a NAP
column (Pharmacia Biotech, Sweden) and the denatured probe
was incubated with the membrane for 2 hours at 60°C in
Express Hybridisation solution (CLONTECH Laboratories,
Inc., USA) (Figure 12). Thus, DPP9 mRNA of appropriate
size was detected in fibrotic mouse liver using RT-PCR.
Furthermore, the single band of mouse DPP9 cDNA hybridised
with a human DPP9 probe, indicating significant homology
between DPP9 of the two species.
CLAIMS

1. A peptide which comprises:
(a) the sequence shown in SEQ ID NO:2; or
(b) the amino acid sequences:
His833GlyTrpSerTyrGlyGlyPheLeu; Leu913AspGluAsnValHisPhePhe;
Glu944ArgHisSerIleArg and Phe350ValIleGlnGluGluPhe, and which
has the substrate specificity of the sequence shown in SEQ
ID NO:2; or
(c) the sequence which has at least 60% identity with
the sequence shown in SEQ ID NO:2, and which has the
substrate specificity of the sequence shown in SEQ ID NO:2;
or
(d) the sequence shown in SEQ ID NO:4.

2. A peptide according to claim 1(c), wherein the
amino acid identity is at least 75%.

3. A peptide according to claim 1(c), wherein the
amino acid identity is at least 95%.

4. A fragment of the sequence shown in SEQ ID NO:2
which has the substrate specificity of the sequence shown
in SEQ ID NO:2.

5. A fragment according to claim 4 which comprises
part of the sequence shown in SEQ ID NO:2.

6. A fusion protein comprising the amino acid
sequence shown in SEQ ID NO:2 linked with a further amino
acid sequence, the fusion protein having the substrate
specificity of the sequence shown in SEQ ID NO:2.

7. A fusion protein according to claim 6 wherein the
further amino acid sequence is selected from the group
consisting of GST, V5 epitope and His tag.



8. A method of identifying a molecule capable of
inhibiting cleavage of a substrate by DPP9, comprising the
following steps:
(a) contacting DPP9 with the molecule;
(b) contacting DPP9 of step (a) with a substrate
capable of being cleaved by DPP9, in conditions sufficient
for cleavage of the substrate by DPP9; and
(c) detecting substrate not cleaved by DPP9, to
identify that the molecule is capable of inhibiting
cleavage of the substrate by DPP9.



9. A method of identifying a molecule capable of
inhibiting specifically the cleavage of a substrate by
DPP9, the method comprising the following steps:
(a) contacting DPP9 and a further protease with the
molecule;
(b) contacting DPP9 and the further protease of step
(a) with a substrate capable of being cleaved by DPP9 and
the further protease, in conditions sufficient for cleavage
of the substrate by DPP9 and the further protease; and
(c) detecting substrate not cleaved by DPP9, but
cleaved by the further protease, to identify that the
molecule is capable of inhibiting specifically the
cleavage of the substrate by DPP9.



10. A method of reducing or inhibiting the catalytic
activity of DPP9, the method comprising the step of
contacting DPP9 with an inhibitor of DPP9 catalytic
activity.

11. A method of cleaving a substrate comprising the
step of contacting the substrate with DPP9 in conditions
sufficient for cleavage of the substrate by DPP9.

12. A nucleic acid molecule which:
(a) encodes the sequence shown in SEQ ID NO:2; or
(b) consists of the sequence shown in SEQ ID NO:1; or
(c) is capable of hybridizing to a nucleic acid
molecule consisting of the sequence shown in SEQ ID NO:1 in
stringent conditions, and which encodes a peptide which has
the substrate specificity of the sequence shown in SEQ ID
NO:2; or
(d) consists of the sequence shown in SEQ ID NO:3.

13. A nucleic acid molecule according to claim 12(c)
wherein the molecule is capable of hybridising in high
stringent conditions.

14. A nucleic acid molecule according to claim 12
which is capable of hybridising to a gene which is located
at band p13.3 on human chromosome 19.

15. A nucleic acid molecule according to claim 12
which does not contain 5' or 3' untranslated regions.

16. A fragment of a nucleic acid molecule consisting
of the sequence shown in SEQ ID NO:1, which encodes a
peptide which has the substrate specificity of the sequence
shown in SEQ ID NO:2.

17. A fragment according to claim 16 which consists
of part of the sequence shown in SEQ ID NO:1.

18. A vector comprising a nucleic acid molecule
according to claim 12.

19. A cell comprising a vector according to claim 18.

20. A composition comprising a peptide according to
claim 1.

21. An antibody which is capable of binding to a
peptide according to claim 1.

22. An antibody according to claim 21 which is
produced by a hybridoma cell.

23. A hybridoma cell capable of making an antibody
according to claim 22.

24. A peptide comprising the sequence shown in SEQ ID
NO:7.

25. A nucleic acid molecule comprising the sequence
shown in SEQ ID NO:8.

10 30 50
1 CGGCGGGTCCCCTGTGTCCGCCGCGGCTGTCGTCCCCCGCTCCCGCCACTTCCGGGGTCG 60
1 RRVPCVRRGCRPPLPPLPGS 20

70 90 110
61 CAGTCCCGGGCATGGAGCCGCGACCGTGAGGCGCCGCTGGACCCGGGACGACCTGCCCAG 120
21 QSRAWSRDREAPLDPGRPAQ 40

130 150 170
121 TCCGGCCGCCGCCCCACGTCCCGGTCTGTGTCCCACGCCTGCAGCTGGAATGGAGGCTCT 180
41 SGRRPTSRSVSHACSWNGGS 60

190 210 230
181 CTGGACCCTTTAGAAGGCACCCCTGCCCTCCTGAGGTCAGCTGAGCGGTTAATGCGGAAG 240
61 LDPLEGTPALLRSAERLMRK 80

250 270 290
241 GTTAAGAAACTGCGCCTGGACAAGGAGAACACCGGAAGTTGGAGAAGCTTCTCGCTGAAT 300
81 VKKLRLDKENTGSWRSFSLN 100

310 330 350
301 TCCGAGGGGGCTGAGAGGATGGCCACCACCGGGACCCCAACGGCCGACCGAGGCGACGCA 360
101 SEGAERMATTGTPTADRGDA 120

370 390 410
361 GCCGCCACAGATGACCCGGCCGCCCGCTTCCAGGTGCAGAAGCACTCGTGGGACGGGCTC 420
121 AATDDPAARFQVQKHSWDGL 140

430 450 470
421 CGGAGCATCATCCACGGCAGCCGCAAGTACTCGGGCCTCATTGTCAACAAGGCGCCCCAC 480
141 RSIIHGSRKYSGLIVNKAPH 160

490 510 530
481 GACTTCCAGTTTGTGCAGAAGACGGATGAGTCTGGGCCCCACTCCCACCGCCTCTACTAC 540
161 DFQFVQKTDESGPHSHRLYY 180

550 570 590
541 CTGGGAATGCCATATGGCAGCCGGGAGAACTCCCTCCTCTACTCTGAGATTCCCAAGAAG 600
181 LGMPYGSRENSLLYSEIPKK 200

610 630 650
601 GTCCGGAAAGAGGCTCTGCTGCTCCTGTCCTGGAAGCAGATGCTGGATCATTTCCAGGCC 660
201 VRKEALLLLSWKQMLDHFQA 220

670 690 710
661 ACGCCCCACCATGGGGTCTACTCTCGGGAGGAGGAGCTGCTGAGGGAGCGGAAACGCCTG 720
221 TPHHGVYSREEELLRERKRL 240

730 750 770
721 GGGGTCTTCGGCATCACCTCCTACGACTTCCACAGCGAGAGTGGCCTCTTCCTCTTCCAG 780
241 GVFGITSYDFHSESGLFLFQ 260

790 810 830
781 GCCAGCAACAGCCTCTTCCACTGCCGCGACGGCGGCAAGAACGGCTTCATGGTGTCCCCT 840
261 ASNSLFHCRDGGKNGFMVSP 280

850 870 890
841 ATGAAACCGCTGGAAATCAAGACCCAGTGCTCAGGGCCCCGGATGGACCCCAAAATCTGC 900
281 MKPLEIKTQCSGPRMDPKIC 300

910 930 950
901 CCTGCCGACCCTGCCTTCTTCTCCTTCAACAATAACAGCGACCTGTGGGTGGCCAACATC 960
301 PADPAFFSFNNNSDLWVANI 320

970 990 1010
961 GAGACAGGCGAGGAGCGGCGGCTGACCTTCTGCCACCAAGGTTTATCCAATGTCCTGGAT 1020
321 ETGEERRLTFCHQGLSNVLD 340

1030 1050 1070
1021 GACCCCAAGTCTGCGGGTGTGGCCACCTTCGTCATACAGGAAGAGTTCGACCGCTTCACT 1080
341 DPKSAGVATFVIQEEFDRFT 360

1090 1110 1130
1081 GGGTACTGGTGGTGCCCCACAGCCTCCTGGGAAGGTTCAGAGGGCCTCAAGACGCTGCGA 1140
361 GYWWCPTASWEGSEGLKTLR 380

1150 1170 1190
1141 ATCCTGTATGAGGAAGTCGATGAGTCCGAGGTGGAGGTCATTCACGTCCCCTCTCCTGCG 1200
381 ILYEEVDESEVEVIHVPSPA 400

1210 1230 1250
1201 CTAGAAGAAAGGAAGACGGACTCGTATCGGTACCCCAGGACAGGCAGCAAGAATCCCAAG 1260
401 LEERKTDSYRYPRTGSKNPK 420

1270 1290 1310
1261 ATTGCCTTGAAACTGGCTGAGTTCCAGACTGACAGCCAGGGCAAGATCGTCTCGACCCAG 1320
421 IALKLAEFQTDSQGKIVSTQ 440

1330 1350 1370
1321 GAGAAGGAGCTGGTGCAGCCCTTCAGCTCGCTGTTCCCGAAGGTGGAGTACATCGCCAGG 1380
441 EKELVQPFSSLFPKVEYIAR 460

1390 1410 1430
1381 GCCGGGTGGACCCGGGATGGCAAATACGCCTGGGCCATGTTCCTGGACCGGCCCCAGCAG 1440
461 AGWTRDGKYAWAMFLDRPQQ 480

1450 1470 1490
1441 TGGCTCCAGCTCGTCCTCCTCCCCCCGGCCCTGTTCATCCCGAGCACAGAGAATGAGGAG 1500
481 WLQLVLLPPALFIPSTENEE 500

1510 1530 1550
1501 CAGCGGCTAGCCTCTGCCAGAGCTGTCCCCAGGAATGTCCAGCCGTATGTGGTGTACGAG 1560
501 QRLASARAVPRNVQPYVVYE 520

1570 1590 1610
1561 GAGGTCACCAACGTCTGGATCAATGTTCATGACATCTTCTATCCCTTCCCCCAATCAGAG 1620
521 EVTNVWINVHDIFYPFPQSE 540

1630 1650 1670
1621 GGAGAGGACGAGCTCTGCTTTCTCCGCGCCAATGAATGCAAGACCGGCTTCTGCCATTTG 1680
541 GEDELCFLRANECKTGFCHL 560

1690 1710 1730
1681 TACAAAGTCACCGCCGTTTTAAAATCCCAGGGCTACGATTGGAGTGAGCCCTTCAGCCCC 1740
561 YKVTAVLKSQGYDWSEPFSP 580

1750 1770 1790
1741 GGGGAAGATGAATTTAAGTGCCCCATTAAGGAAGAGATTGCTCTGACCAGCGGTGAATGG 1800
581 GEDEFKCPIKEEIALTSGEW 600

1810 1830 1850
1801 GAGGTTTTGGCGAGGCACGGCTCCAAGATCTGGGTCAATGAGGAGACCAAGCTGGTGTAC 1860
601 EVLARHGSKIWVNEETKLVY 620

1870 1890 1910
1861 TTCCAGGGCACCAAGGACACGCCGCTGGAGCACCACCTCTACGTGGTCAGCTATGAGGCG 1920
621 FQGTKDTPLEHHLYVVSYEA 640

1930 1950 1970
1921 GCCGGCGAGATCGTACGCCTCACCACGCCCGGCTTCTCCCATAGCTGCTCCATGAGCCAG 1980
641 AGEIVRLTTPGFSHSCSMSQ 660

1990 2010 2030
1981 AACTTCGACATGTTCGTCAGCCACTACAGCAGCGTGAGCACGCCGCCCTGCGTGCACGTC 2040
661 NFDMFVSHYSSVSTPPCVHV 680

2050 2070 2090
2041 TACAAGCTGAGCGGCCCCGACGACGACCCCCTGCACAAGCAGCCCCGCTTCTGGGCTAGC 2100
681 YKLSGPDDDPLHKQPRFWAS 700

2110 2130 2150
2101 ATGATGGAGGCAGCCAGCTGCCCCCCGGATTATGTTCCTCCAGAGATCTTCCATTTCCAC 2160
701 MMEAASCPPDYVPPEIFHFH 720

2170 2190 2210
2161 ACGCGCTCGGATGTGCGGCTCTACGGCATGATCTACAAGCCCCACGCCTTGCAGCCAGGG 2220
721 TRSDVRLYGMIYKPHALQPG 740

2230 2250 2270
2221 AAGAAGCACCCCACCGTCCTCTTTGTATATGGAGGCCCCCAGGTGCAGCTGGTGAATAAC 2280
741 KKHPTVLFVYGGPQVQLVNN 760

2290 2310 2330
2281 TCCTTCAAAGGCATCAAGTACTTGCGGCTCAACACACTGGCCTCCCTGGGCTACGCCGTG 2340
761 SFKGIKYLRLNTLASLGYAV 780

2350 2370 2390
2341 GTTGTGATTGACGGCAGGGGCTCCTGTCAGCGAGGGCTTCGGTTCGAAGGGGCCCTGAAA 2400
781 VVIDGRGSCQRGLRFEGALK 800

2410 2430 2450
2401 AACCAAATGGGCCAGGTGGAGATCGAGGACCAGGTGGAGGGCCTGCAGTTCGTGGCCGAG 2460
801 NQMGQVEIEDQVEGLQFVAE 820

2470 2490 2510
2461 AAGTATGGCTTCATCGACCTGAGCCGAGTTGCCATCCATGGCTGGTCCTACGGGGGCTTC 2520
821 KYGFIDLSRVAIHGWSYGGF 840

2530 2550 2570
2521 CTCTCGCTCATGGGGCTAATCCACAAGCCCCAGGTGTTCAAGGTGGCCATCGCGGGTGCC 2580
841 LSLMGLIHKPQVFKVAIAGA 860

2590 2610 2630
2581 CCGGTCACCGTCTGGATGGCCTACGACACAGGGTACACTGAGCGCTACATGGACGTCCCT 2640
861 PVTVWMAYDTGYTERYMDVP 880

2650 2670 2690
2641 GAGAACAACCAGCACGGCTATGAGGCGGGTTCCGTGGCCCTGCACGTGGAGAAGCTGCCC 2700
881 ENNQHGYEAGSVALHVEKLP 900

2710 2730 2750
2701 AATGAGCCCAACCGCTTGCTTATCCTCCACGGCTTCCTGGACGAAAACGTGCACTTTTTC 2760
901 NEPNRLLILHGFLDENVHFF 920

2770 2790 2810
2761 CACACAAACTTCCTCGTCTCCCAACTGATCCGAGCAGGGAAACCTTACCAGCTCCAGATC 2820
921 HTNFLVSQLIRAGKPYQLQI 940

2830 2850 2870
2821 TACCCCAACGAGAGACACAGTATTCGCTGCCCCGAGTCGGGCGAGCACTATGAAGTCACG 2880
941 YPNERHSIRCPESGEHYEVT 960

2890 2910 2930
2881 TTACTGCACTTTCTACAGGAATACCTCTGAGCCTGCCCACCGGGAGCCGCCACATCACAG 2940
961 LLHFLQEYL*

2950 2970 2990
2941 CACAAGTGGCTGCAGCCTCCGCGGGGAACCAGGCGGGAGGGACTGAGTGGCCCGCGGGCC 3000

3001 CCAGTGAGGCACTTTGTCCCGCCC 3020



FIGURE 4 



SUBSTITUTE SHEET (RULE 26) RO/AU 



WO 02/34900 PCT/AU01/01388 



8/24 



[Alignment of the amino acid sequence encoded by the sequence of Figure 4 with the amino acid sequence encoded by the predicted open reading frame of GDD]

FIGURE 5



FIGURE 6 


FIGURE 7 



GAP of: hdpp9.aa check: 2050 from: 1 to: 969
/home/rpag02/Cathy/tedfamily/PATENT/hdpp9.aa [Unknown form]

to: mdpp9.aa check: 4436 from: 1 to: 847
/home/rpag02/Cathy/tedfamily/PATENT/mdpp9.aa [Unknown form]

Symbol comparison table: /dbase/gcg/gcgcore/data/rundata/nwsgappep.cmp
CompCheck: 1254

Gap Weight: 3.000 Average Match: 0.540
Length Weight: 0.100 Average Mismatch: -0.396

Quality: 1179.7 Length: 969
Ratio: 1.393 Gaps: 2
Percent Similarity: 94.215 Percent Identity: 90.555

hdpp9.aa x mdpp9.aa October 5, 19101 16:00 ..



51 SHACSWNGGSLDPLEGTPALLRSAERLMRKVKKLRLDKENTGSWRSFSLN 100 



1 



.P 1 



. « • • * 

101 SEGAERMATTGTPTADRGDAAATDDPAARFQVQKHSWDGLRSIIHGSRKY 150 

|::::||. .|....:...|.. ||.|||| I I I I I II I I I I I I I I I I I 
2 SQEPQRMC . SGVSPVEQVAAGDMDDTAARFCVQKHSWDGLRSIIHGSRKS 50 

151 SGLIVNKAPHDFQFVQKTDESGPHSHRLYYLGMPYGSRENSLLYSEIPKK 200 

1 1 1 1 1 - 1 1 1 1 1 1 1 1 1 1 1 - 1 1 1 1 M 1 1 1 1 i 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 M 1 1 1 ! 

51 SGLIVSKAPHDFQFVQKPDESGPHSHRLYYLGMPYGSRENSLLYSEIPKK 100 
201 VRKEALLLLSWKQMLDHFQATPHHGVYSREEELLRERKRLGVFGITSYDF 250

• iiiiiiimiiMMiiiiiiiifiiiiinmiiiiiimiiim 

101 VRKEALLLLSWKQMLDHFQATPHHGVYSREEELLRERKRLGVFGITSYDF 150
251 HSESGLFLFQASNSLFHCRDGGKNGFMVSPMKPLEIKTQCSGPRMDPKIC 300

Mill HIM II IIMIilMMIMIMI III Mill I Mill III 1 1 1 

151 HSESGLFLFQASNSLFHCRDGGKNGFMVSPMKPLEIKTQCSGPRMDPKIC 200 



301 PADPAFFSFNNNSDLWVANIETGEERRLTFCHQGLSNVLDDPKSAGVATF 350 

MMMMI 1 1 1 1 11 1 1 1 II 1 1 1 1 1 II 1 1 1 1 1 1 MIIMIIIIIIII 

201 PADPAFFSFINNSDLWVANIETGEERRLTFCHQGSAGVLDNPKSAGVATF 250 



351 VIQEEFDRFTGYWWCPTASWEGSEGLKTLRILYEEVDESEVEVIHVPSPA 400

MM M MM hHMMM MhlHMM MM MIIIMMHI III 

251 VIQEEFDRFTGCWWCPTASWEGSEGLKTLRILYEEVDESEVEVIHVPSPA 300 
401 LEERKTDSYRYPRTGSKNPKIALKLAEFQTDSQGKIVSTQEKELVQPFSS 450 

] 1 1 1 1 1 1 ! J 1 1 } f 1 1 ! 1 1 f 1 1 1 1 1 1 1 1 ^ 1 1 1 llllll- MMMMI! 

301 LEERKTDSYRYPRTGSKNPKIALKLAELQTDHQGKIVSSCEKELVQPFSS 350 
451 LFPKVEYIARAGWTRDGKYAWAMFLDRPQQWLQLVLLPPALFIPSTENEE 500 

Ml I II MUM III IIMIMIIIIIMMIIIIMIIIIMI-MM- 
351 LFPKVEYIARAGWTRDGKYAWAMFLDRPQQRLQLVLLPPALFIPAVESEA 400 



501 QRLASARAVPRNVQPYVVYEEVTNVWINVHDIFYPFPQSEGEDELCFLRA 550

II MllllhlllhhlMlllimilllMIIMIM^Mllll 
401 QRQAAARAVTKNVQPFVIYEEVTNWINVHDIFHPFPQAEGQQDFCFLRA 450 

551 NECKTGFCHLYKVTAVLKSQGYDWSEPFSPGEDEFKCPIKEEIALTSGEW 600 

1 1 M 1 1 1 II I h M • IM-MM-IMIMhIIIMIMMIMMII 

451. NECKTGFCHLYRVTVELKTKDYDWTEPLSPTEGEFKCPIKEEVALTSGEW 500 
601 EVLARHGSKIWVNEETKLVYFQGTKDTPLEHHLYVVSYEAAGEIVRLTTP 650

1 1 MM III II I hi I II I IIIIIIIMIIIIIIIIIMIIIIJ.il I 
501 EVLSRHGSKIWVNEQTKLVYFQGTKDTPLEHHLYVVSYESAGEIVRLTTL 550
651 GFSHSCSMSQNFDMFVSHYSSVSTPPCVHVYKLSGPDDDPLHKQPRFWAS 700

I I i I MIIM-M MINIM I II I Mill III MM M M M II 1 1 Ml 

551 GFSHSCSMSQSFDMFVSHYSSVSTPPCVHVYKLSGPDDDPLHKQPRFWAS 600 

« • - • 

701 MMEAASCPPDYVPPEIFHFHTRSDVRLYGMIYKPHALQPGKKHPTVLFVY 750 

IMIMMMI II MIIIIIIIMIMIMIIIIIMIIIMMIIM II 

601 MMEAANCPPDYVPPEIFHFHTRADVQLYGMIYKPHTLQPGRKHPTVLFVY 650 
751 GGPQVQLVNNSFKGIKYLRLNTLASLGYAVVVIDGRGSCQRGLRFEGALK 800

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 E 1 1 1 ! 1 1 i 1 1 1 1 1 1 1 1 1 1 ^ 1 1 1 1 1 1 

651 GGPQVQLVNNSFKGIKYLRLNTLASLGYAVVVIDGRGSCQRGLHFEGALK 700
801 NQMGQVEIEDQVEGLQFVAEKYGFIDLSRVAIHGWSYGGFLSLMGLIHKP 850 

1 1 1 1 1 1 1 1 II 1 1 i 1 1 h 1 1 1 1 1 II i i 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 i 1 1 1 1 1 1 1 

701 NQMGQVEIEDQVEGLQYVAEKYGFIDLSRVAIHGWSYGGFLSLMGLIHKP 750 
851 QVFKVAIAGAPVTVWMAYDTGYTERYMDVPENNQHGYEAGSVALHVEKLP 900

I I I i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 M 1 = 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 

751 QVFKVAIAGAPVTVWMAYDTGYTERYMDVPENNQQGYEAGSVALHVEKLP 800 

901 NEPNRLLILHGFLDENVHFFHTNFLVSQLIRAGKPYQLQIYPNERHSIRC 950 

I II I I I II I I I II I I II I I I I I I II M I I I I M I I! I I h I : 

801 NEPNRLLILHGFLDENVHFFHTNFLVSQLIRAGKPYQLQV ASVTT 845 

951 PESGEHYEVTLLHFLQEYL 969 
h 

846 PQ 847 
GAP of: dpp9patent.dna check: 1968 from: 1 to: 3000
/home/rpag02/Cathy/tedfamily/PATENT/dpp9patent.dna [Unknown form]

to: mdpp9.dna check: 672 from: 1 to: 2873
/home/rpag02/Cathy/tedfamily/PATENT/mdpp9.dna [Unknown form]

Symbol comparison table: /dbase/gcg/gcgcore/data/rundata/nwsgapdna.cmp
CompCheck: 6876

Gap Weight: 5.000 Average Match: 1.000
Length Weight: 0.300 Average Mismatch: 0.000

Quality: 2166.5 Length: 3172
Ratio: 0.754 Gaps: 2
Percent Similarity: 80.637 Percent Identity: 80.637

dpp9patent.dna x mdpp9.dna October 5, 19101 16:00



2 51 TGCGCCTGGACAAGGAGAAC ACCGGAAGTTGGAGAAGCTTCTCGCTGAAT 300 



1 



GCCA 4 




5 TCACAGGAGCCCCAGAGGATG . . . TGCAGCGGGGTCTCCCCAGTTGAGCA 51 



351 AGGCGACGCAGCCGCCACAGATGACCCGGCCGCCCGCTTCCAGGTGCAGA 400 
401 AGCACTCGTGGGACGGGCTCCGGAGCATCATCCACGGCAGCCGCAAGTAC 450

II Mil 1 1 MM I Mill II Mill II II III II II IIIMM I 

102 AGCACTCGTGGGATGGGCTGCGTAGCATTATCCACGGCAGTCGCAAGTCC 151 

451 TCGGGCCTCATTGTCAACAAGGCGCCCCACGACTTCCAGTTTGTGCAGAA 500 

1 1 1 1 1 11 1 1 1 1 1 1 M I llllll MIIMMMIIMIMMMIMM 

152 TCGGGCCTCATTGTCAGCAAGGCCCCCCACGACTTCCAGTTTGTGCAGAA 201 
501 GACGGATGAGTCTGGGCCCCACTCCCACCGCCTCTACTACCTGGGAATGC 550 

I I 1 1 MINIM MINIM MIM 1 1 1 1 1 II M I NIMH 

202 GCCTGACGAGTCTGGCCCCCACTCTCACCGTCTCTATTACCTCGGAATGC 251 
551 CATATGGCAGCCGGGAGAACTCCCTCCTCTACTCTGAGATTCCCAAGAAG 600 

i ii linn 1 1 iiiiiiiiiiiiiiiiiiii mil urn m 

252 CTTACGGCAGCCGTGAGAACTCCCTCCTCTACTCCGAGATCCCCAAGAAA 3 01 
601 GTCCGGAAAGAGGCTCTGCTGCTCCTGTCCTGGAAGCAGATGCTGGATCA 650 

II Mill Mill II I III 1 1 llllll 1 1 1 II 1 1 1 1 1 1 1 1 1 1 II II 

302 GTGCGGAAGGAGGCCCTGCTGCTGCTGTCCTGGAAGGAGATGCTGGACCA 351 
651 TTTCCAGGCCACGCCCCACCATGGGGTCTACTCTCGGGAGGAGGAGCTGC 700 

MNIINNI 1 1 ! 1 1 1 1 1 1 i I MINIM || i 1 1 1 M II 1 1 1 I 

352 CTTCCAGGCCACACCCCACCATGGTGTCTACTCCCGAGAGGAGGAGCTAC 401 

• • . . . 
701 TGAGGGAGCGGAAACGCCTGGGGGTCTTCGGCATCACCTCCTACGACTTC 750 

II IIIMM II II I III II IIIMM I III III II II llllll 

402 TGCGGGAG03CAAGCGCCTGGGCGTCTTCGGAATCACCTCTTATGACTTC 451 

• ♦ . . « 

751 CACAGCGAGAGTGGCCTCTTCCTCTTCCAGGCCAGCAACAGCCTCTTCCA 800 

1 1 ii i ii 1 1 1 ii 1 1 1 ii ii 1 111 ii ii 1 1 ii 1 1 1 ii 1 1 ii i m ii 

452 CACAGTGAGAGCGGCCTCTTCCTCTTCCAGGCCAGCAATAGCCTGTTCCA 501 
801 CTGCCGCGACGGCGGCAAGAACGGCTTCATGGTGTCCCCTATGAAACCGC 850 

ii 1 1 i ii ii iiiiiiii mil milium Mm ii i 
502 CTGCAGGGATGGTGGCAAGAATGGCTTTATGGTGTCCCCGATGAAGCCAC 551

851 TGGAAATCAAGACCCAGTGCTCAGGGCCCCGGATGGACCCCAAAATCTGC 900 

1 1 1 1 MINIM MMI M IMM II MMMMMMMMM 

552 TGGAGATCAAGACTCAGTGTTCTGGGCCACGCATGGACCCCAAAATCTGC 601 

901 CCTGCCGACCCTGCCTTCTTCTCCTTCAACAATAACAGCGACCTGTGGGT 950 

M M MMMMMMM MMMI Ml Mill M MINIM 

602 CCCGCAGACCCTGCCTTCTTTTCCTTCATCAACAACAGTGATCTGTGGGT 651 
951 GGCCAACATCGAGACAGGCGAGGAGCGGCGGCTGACCTTCTGCCACCAAG 1000 

IN 1 1 1 II I II 1 1 1 M IMM MINIM MMMI! INN I 

652 GGCAAACATCGAGACTGGGGAGGAACGGCGGCTCACCTTCTGTCACCAGG 701 
1001 GTTTATCCAATGTCCTGGATGACCCCAAGTCTGCGGGTGTGGCCACCTTC 105 0 

III I I MINIM! I Mill II II II 1 1 1 1 1 1 1 1 1 1 1 

702 GTTCAGCTGGTGTCCTGGACAATCCCAAATCAGCAGGCGTGGCCACCTTT 751 
1051 GTCATACAGGAAGAGTTCGACCGCTTCACTGGGTACTGGTGGTGCCCCAC 1100 

Nlll Mill 1 1 f I ( 1 1 1 II 1 1 II I M i 1 1 M IIIMMMNNII 

752 GTCATCCAGGAGGAGTTCGACCGCTTCACTGGGTGCTGGTGGTGCCCCAC 801 
1101 AGCCTCCTGGGAAGGTTCAGAGGGCCTCAAGACGCTGCGAATCCTGTATG 1150 

inn mum ii n n iiiimiiimi inn i m 

802 GGCCTCTTGGGAAGGCTCCGAAGGTCTCAAGACGCTGCGCATCCTATATG 851 

• . • • 

1151 AGGAAGTCGATGAGTCCGAGGTGGAGGTCATTCACGTCCCCTCTCCTGCG 1200 

III INI II Mill II III III II I III II II Nlll II II 

852 AGGAAGTGGACGAGTCTGAAGTGGAGGTCATTCATGTGCCCTCCCCCGCC 901 
12 01 CTAGAAGAAAGGAAGACGGACTCGTATCGGTACCCCAGGACAGGCAGCAA 1250 

II II II 1 1 1 1 1 1 1 1 1 1 1 i 1 1 II II I II M M II M M 1 1 1 11 i I 

902 CTGGAGGAGAGGAAGACGGACTCCTACCGCTACCCCAGGACAGGCAGCAA 951 
1251 GAATCCCAAGATTGCCTTGAAACTGGCTGAGTTCCAGACTGACAGCCAGG 1300



I Ml 



952 GAACCCCAAGATTGCCCTGAAGCTGGCTGAGCTCCAGACGGACCATCA 1001 
1301 GCAAGATCGTCTCGACCCAGGAGAAGGAGCTGGTGCAGCCCTTCAGCTCG 1350 

r Nil lllll II I I MINIM Mill Mill MINIM 

1002 GCAAAATCGTGTCAAGCTGCGAGAAGGAACTGGTACAGCCATTCAGCTCC 1051 
1351 CTGTTCCCGAAGGTGGAGTACATCGCCAGGGCCGGGTGGACCCGGGATGG 1400 

II NMI II MIMI II MINI I MM M Mill Mill II 

1052 CTTTTCCCCAAAGTGGAGTACATCGCCCGGGCTGGCTGGACACGGGACGG 1101 
1401 CAAATACGCCTGGGCCATGTTCCTGGACCGGCCCCAGCAGTGGCTCCAGC 1450 

MINI M 1 1 1 1 II 1 1 1 II 1 1 1 1 II 1 1 1 1 MINIM MM MM 

1102 CAAATATGCCTGGGCCATGTTCCTGGACCGTCCCCAGCAACGGCTTCAGC 1151 
1451 TCGTCCTCCTCCCCCCGGCCCTGTTCATCCCGAGCACAGAGAATGAGGAG 1500 

I M MM II lllll II || 1 1 III MM I MM Mill 

1152 TTGTCCTCCTGCCCCCTGCTCTCTTCATCCCGGCCGTTGAGAGTGAGGCC 1201 
1501 CAGCGGCTAGCCTCTGCCAGAGCTGTCCCCAGGAATGTCCAGCCGTATGT 1550 

19n , I INN I II MM I II I II I II MM MUM lllll I III 

1202 CAGCGGCAGGCAGCTGCCAGAGCCGTCCCCAAGAATGTGCAGCCCTTTGT 1251 
1551 GGTGTACGAGGAGGTCACCAACGTCTGGATCAATGTTCATGACATCTTCT 1600 

„„ I 11 'I 'I 1 1 1 1 1 M I MIMIIMM II II III MM II 

1252 CATCTATGAAGAAGTCACCAATGTCTGGATCAACGTCCACGACATCTTCC 13 01 

1601 ATCCCTTCCCCCAATCAGAGGGAGAGGACGAGCTCTGCTTTCTCCGCGCC 1650 

™ I M II II II I Mill lllll I II II II II III 

1302 ACCCGTTTCCTCAGGCTGAGGGCCAGCAGGACTTTTGTTTCCTTCGTGGC 1351 

1651 AATGAATGCAAGACCGGCTTCTGCCATTTGTACAAAGTCACCGCCGTTTT 1700 

I I 1 1 M 1 1 1 II 1 1 MIMIIMM MIMI Mill I I I 

1352 AACGAATGCAAGACTGGCTTCTGCCACCTGTACAGGGTCACAGTGGAACT 1401 
1701 AAAATCCCAGGGCTACGATTGGAGTGAGCCCTTCAGCCCCGGGGAAGATG 1750

1 1 1 II Ml Ml II Mil II III NIMH I I I I M 

1402 TAAAACCAAGGACTATGACTGGACGGAACCCCTCAGCCCTACAGAAGGTG 1451 

1751 AATTTAAGTGCCCCATTAAGGAAGAGATTGCTCTGACCAGCGGTGAATGG 1800 

I llllllllilllll INN III I II llllllll II (I III 

1452 AGTTTAAGTGCCCCATCAAGGAGGAGGTCGCCCTGACCAGTGGCGAGTGG 1501 
1801 GAGGTTTTGGCGAGGCACGGCTCCAAGATCTGGGTCAATGAGGAGACCAA 1850 

Hill III 1 1 1 1 1 1 1 11111111111111111111 III MM || 

1502 GAGGTCTTGTCGAGGCATGGCTCCAAGATCTGGGTCAACGAGCAGACGAA 1551 

* * 

1851 GCTGGTGTACTTCCAGGGCACCAAGGACACGCCGCTGGAGCACCACCTCT 1900 

llllllllllll II ll ll llllllll llllllll II lllllll 

1552 GCTGGTGTACTTTCAAGGTACAAAGGACACACCGCTGGAACATCACCTCT 1601 
1901 ACGTGGTCAGCTATGAGGCGGCCGGCGAGATCGTACGCCTCACCACGCCC 1950 

i i ii iii ii iii iii i ii iii iii i mi ii i iiiiiii n i 

1602 ATGTGGTCAGCTACGAGTCAGCAGGCGAGATCGTGCGGCTCACCACGCTC 1651 
1951 GGCTTCTCCCATAGCTGCTCCATGAGCCAGAACTTCGACATGTTCGTCAG 2000 

Illlillllll Mlllllllllllllllll MMIMllllllll II 

1652 GGCTTCTCCCACAGCTGCTCCATGAGCCAGAGCTTCGACATGTTCGTGAG 1701 

• • ♦ • 

2001 CCACTACAGCAGCGTGAGCACGCCGCCCTGCGTGCACGTCTACAAGCTGA 2050 

ii inn n 1 1 ininnin inn n n n mi inn i 

1702 TCACTACAGCAGTGTGAGCACGCCACCCTGTGTACATGTGTACAAGCTGA 1751 
2051 GCGGCCCCGACGACGACCCCCTGCACAAGCAGCCCCGCTTCTGGGCTAGC 2100

minim n inn minimi n iiiiiiiiiimii 

1752 GCGGCCCCGATGATGACCCACTGCACAAGCAACCACGCTTCTGGGCCAGC 1801 

• • • • • 
2101 ATGATGGAGGCAGCCAGCTGCCCCCCGGATTATGTTCCTCCAGAGATCTT 2150 

J 1 1 1 1 1 1 1 J 1 1 1 1 1 1 ! - iinini ii inn n n immi 

1802 ATGATGGAGGCAGCCAATTGCCCCCCAGACTATGTGCCCCCTGAGATCTT 1851
2151 CCATTTCCACACGCGCTCGGATGTGCGGCTCTACGGCATGATCTACAAGC 2200

III If MUM I! I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

1852 CCACTTCCACACCCGTGCAGACGTGCAGCTCTACGGCATGATCTACAAGC 1901 
2201 CCCACGCCTTGCAGCCAGGGAAGAAGCACCCCACCGTCCTCTTTGTATAT 2250

I III II MM II MM 1 { 1 1 1 1 1 1 1 1 1 1 II MIMIII III 

1902 CACACACCCTGCAACCTGGGAGGAAGCACCCCACTGTGCTCTTTGTCTAT 1951 
2251 GGAGGCCCCCAGGTGCAGCTGGTGAATAACTCCTTCAAAGGCATCAAGTA 2300

I I HIM II I II 1 1 II IIIIMI II M M II II I II M M I II 

1952 GGGGGCCCACAGGTGCAGTTGGTGAACAACTCCTTTAAGGGCATCAAATA 2001
2301 CTTGCGGCTCAACACACTGGCCTCCCTGGGCTACGCCGTGGTTGTGATTG 2350

I lllllll II llllllll III IIIIMI II Mill Mill I 

2002 CCTGCGGCTAAATACACTGGCATCCTTGGGCTATGCTGTGGTGGTGATCG 2051 
2351 ACGGCAGGGGCTCCTGTCAGCGAGGGCTTCGGTTCGAAGGGGCCCTGAAA 2400 

I II 1 1 1 1 1 1 M 1 1 M 1 1 1 1 II II I HIM IIIIIIMIIII 

2052 ATGGTCGGGGCTCCTGTCAGCGGGGCCTGCACTTCGAGGGGGCCCTGAAA 2101
2401 AACCAAATGGGCCAGGTGGAGATCGAGGACCAGGTGGAGGGCCTGCAGTT 2450

II MMMMMMIMIMM 1 1 1 1 1 M 1 1 1 1 1 1 1 III MUM 

2102 AATCAAATGGGCCAGGTGGAGATTGAGGACCAGGTGGAAGGCTTGCAGTA 2151 
2451 CGTGGCCGAGAAGTATGGCTTCATCGACCTGAGCCGAGTTGCCATCCATG 2500

MUM MIIMIMIMIMM III M II I II M I IMIIIMM 

2152 CGTGGCTGAGAAGTATGGCTTCATTGACTTGAGCCGAGTCGCCATCCATG 2201 
2501 GCTGGTCCTACGGGGGCTTCCTCTCGCTCATGGGGCTAATCCACAAGCCC 2550 

1 1 1 1 1 1 1 1 1 1 1 1 1 11 1 1 1 1 M 1 1 1 lllllllllll 1 1 1 1 1 i 1 1 1 1 1 

2202 GCTGGTCCTACGGCGGCTTCCTCTCACTCATGGGGCTCATCCACAAGCCA 2251 

• • » < • 

2551 CAGGTGTTCAAGGTGGCCATCGCGGGTGCCCCGGTCACCGTCTGGATGGC 2600 
II lllllllllll IMM Mill II II Mill II llllllll 
2252 CAAGTGTTCAAGGTAGCCATTGCGGGCGCTCCTGTCACTGTGTGGATGGC 2301

Rat Multiple Tissue Northern Blot hybridised with a human DPP9 probe of 2,589 bases. The hybridisation was carried out overnight at 60 °C.



2601 CTACGACACAGGGTACACTGAGCGCTACATGGACGTCCCTGAGAACAACC 2650

III 1 1 1 1 1 1 1 1 1 ( i f 1 1 II II MINIM Mill M II MM 

2302 CTATGACACAGGGTACACGGAACGATACATGGATGTCCCCGAAAATAACC 2351 



2651 AGCACGGCTATGAGGCGGGTTCCGTGGCCCTGCACGTGGAGAAGCTGCCC 2700 

MM 1 1 1 1 1 1 1 1 1 1 1 II M M MIIIMI IMIIMMMMM 

2352 AGCAAGGCTATGAGGCAGGGTCTGTAGCCCTGCATGTGGAGAAGCTGCCC 2401 
2701 AATGAGCCCAACCGCTTGCTTATCCTCCACGGCTTCCTGGACGAAAACGT 2750 

MIIIMI MMM 1 1 1 1 1 1 1 1 ! I i 1 1 1 M I ( 1 1 1 II M 11 1 Mill 

2402 AATGAGCCTAACCGCCTGCTTATCCTGCACGGCTTCCTGGACGAGAACGT 2451 



2751 GCACTTTTTCCACACAAACTTCCTCGTCTCCCAACTGATCCGAGCAGGGA 2800 

Mill 1 1 1 1 1 1 1 1 1 1 1 Mill M Mill 1 1 1 1 1 i 1 1 1 1 1 1 1 1 I 

2452 TCACTTCTTCCACACAAATTTCCTGGTGTCCCAGCTGATCCGAGCAGGAA 2501



2801 AACCTTACCAGCTCCAGAT..CTACCCCAACGAGAGACACAGTATTCGCT 2848



2849 GCCCCGAGTCGGGCGAGCACTATGAAGTCACGTTACTGCACTTTCTACAG 2898 

III I Mill III I II II II 

2552 CTCACTAAGACCCCAGTTTTGATGAACCCACTTGGCTACAGGCATGGGAG 2601 

2899 GAATACCTCTGAGCCTGCCCACCGGGAGCCGCCACATCACAGCACAAGTG 2948

II , I II MM I I I I I 

2602 TGCCCCCCAATGATTAGAGACCCAAGAGCAGTTGCCTGAGGGAGAGGACA 2651 

2949 GCTGCAGCCTCCGCGGGGAACCAGGCGGGAGGGACTGAGTGGCCCGCGGG 2998 

I II I I I II I I II I II II 

2652 TTTAAAGGTCCAGGACTGAATCTACCCAAACGAGAGACATAGCATCCGCT 2701

2999 CC 3000 

I 

2702 GCCGCGAGTCCGGAGAGCATTACGAGGTGACGCTGCTGCACTTTCTGCAG 2751 



2502 




2551 
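
The match rows printed between each pair of sequence rows in the alignment above mark the columns in which the two sequences carry the same base. As a minimal sketch (not part of the original disclosure), such a row and a percent identity can be recomputed from any pair of aligned rows. The function names `match_line` and `percent_identity` are hypothetical, and the sketch assumes that `.` denotes a gap column. The two example rows are the ones numbered 1851 to 1900 (human) and 1552 to 1601 (mouse).

```python
def match_line(top: str, bottom: str, gap: str = ".") -> str:
    """Return '|' under every column where both rows carry the same base."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same number of columns")
    return "".join(
        "|" if a == b and a != gap else " "
        for a, b in zip(top.upper(), bottom.upper())
    )


def percent_identity(top: str, bottom: str, gap: str = ".") -> float:
    """Percentage of identical columns among columns without a gap (assumed convention)."""
    marks = match_line(top, bottom, gap)
    compared = sum(1 for a, b in zip(top, bottom) if a != gap and b != gap)
    return 100.0 * marks.count("|") / compared if compared else 0.0


# Rows copied from the alignment: human 1851-1900 and mouse 1552-1601.
human = "GCTGGTGTACTTCCAGGGCACCAAGGACACGCCGCTGGAGCACCACCTCT"
mouse = "GCTGGTGTACTTTCAAGGTACAAAGGACACACCGCTGGAACATCACCTCT"

print(human)
print(match_line(human, mouse))
print(mouse)
print(f"{percent_identity(human, mouse):.1f}% identical in this block")
```

Applying the same two functions to every block gives a match row for each pair of rows. An overall identity should be computed from summed match and column counts rather than from an average of the per-block percentages.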
DPP9 PCR products.
Lane 2: generated from CEM cell line RNA using DPP9 primers 22F and 3' end.
Lane 4: the same primers with XbaI sites on the ends.

Figure showing DPP9 PCR products from liver of six mice (numbered 1 to 6) and the largest human DPP9 fragment.

SEQUENCE LISTING



<110> THE UNIVERSITY OF SYDNEY 

<120> DIPEPTIDYL PEPTIDASES 

<130> FP15217 

<160> 8 

<170> PatentIn version 3.1

<210> 1 

<211> 3000 

<212> DNA

<213> Homo sapiens 

<400> 1

cggcgggtcc cctgtgtccg ccgcggctgt cgtcccccgc tcccgccact tccggggtcg
60



cagtcccggg catggagccg cgaccgtgag gcgccgctgg acccgggacg acctgcccag 
120 



tccggccgcc gccccacgtc ccggtctgtg tcccacgcct gcagctggaa tggaggctct 
180 

ctggaccctt tagaaggcac ccctgccctc ctgaggtcag ctgagcggtt aatgcggaag 
240 

gttaagaaac tgcgcctgga caaggagaac accggaagtt ggagaagctt ctcgctgaat 
300



tccgaggggg ctgagaggat ggccaccacc gggaccccaa cggccgaccg aggcgacgca
360



gccgccacag atgacccggc cgcccgcttc caggtgcaga agcactcgtg ggacgggctc 
420 

cggagcatca tccacggcag ccgcaagtac tcgggcctca ttgtcaacaa ggcgccccac 

480



gacttccagt ttgtgcagaa gacggatgag tctgggcccc actcccaccg cctctactac 
540 

ctgggaatgc catatggcag ccgggagaac tccctcctct actctgagat tcccaagaag 
600 



gtccggaaag aggctctgct gctcctgtcc tggaagcaga tgctggatca tttccaggcc 
660 



acgccccacc atggggtcta ctctcgggag gaggagctgc tgagggagcg gaaacgcctg
720

ggggtcttcg gcatcacctc ctacgacttc cacagcgaga gtggcctctt cctcttccag
780

gccagcaaca gcctcttcca ctgccgcgac ggcggcaaga acggcttcat ggtgtcccct
840

atgaaaccgc tggaaatcaa gacccagtgc tcagggcccc ggatggaccc caaaatctgc
900

cctgccgacc ctgccttctt ctccttcaac aataacagcg acctgtgggt ggccaacatc
960

gagacaggcg aggagcggcg gctgaccttc tgccaccaag gtttatccaa tgtcctggat
1020

gaccccaagt ctgcgggtgt ggccaccttc gtcatacagg aagagttcga ccgcttcact
1080

gggtactggt ggtgccccac agcctcctgg gaaggttcag agggcctcaa gacgctgcga
1140

atcctgtatg aggaagtcga tgagtccgag gtggaggtca ttcacgtccc ctctcctgcg
1200

ctagaagaaa ggaagacgga ctcgtatcgg taccccagga caggcagcaa gaatcccaag
1260

attgccttga aactggctga gttccagact gacagccagg gcaagatcgt ctcgacccag
1320

gagaaggagc tggtgcagcc cttcagctcg ctgttcccga aggtggagta catcgccagg
1380

gccgggtgga cccgggatgg caaatacgcc tgggccatgt tcctggaccg gccccagcag
1440

tggctccagc tcgtcctcct ccccccggcc ctgttcatcc cgagcacaga gaatgaggag
1500

cagcggctag cctctgccag agctgtcccc aggaatgtcc agccgtatgt ggtgtacgag
1560

gaggtcacca acgtctggat caatgttcat gacatcttct atcccttccc ccaatcagag
1620

ggagaggacg agctctgctt tctccgcgcc aatgaatgca agaccggctt ctgccatttg
1680

tacaaagtca ccgccgtttt aaaatcccag ggctacgatt ggagtgagcc cttcagcccc
1740
Page 2 



WO 02/34900 

Untitled.ST25.txt



ggggaagatg aatttaagtg ccccattaag gaagagattg ctctgaccag cggtgaatgg
1800

gaggttttgg cgaggcacgg ctccaagatc tgggtcaatg aggagaccaa gctggtgtac
1860

ttccagggca ccaaggacac gccgctggag caccacctct acgtggtcag ctatgaggcg
1920

gccggcgaga tcgtacgcct caccacgccc ggcttctccc atagctgctc catgagccag
1980

aacttcgaca tgttcgtcag ccactacagc agcgtgagca cgccgccctg cgtgcacgtc
2040

tacaagctga gcggccccga cgacgacccc ctgcacaagc agccccgctt ctgggctagc
2100

atgatggagg cagccagctg ccccccggat tatgttcctc cagagatctt ccatttccac
2160

acgcgctcgg atgtgcggct ctacggcatg atctacaagc cccacgcctt gcagccaggg
2220

aagaagcacc ccaccgtcct ctttgtatat ggaggccccc aggtgcagct ggtgaataac
2280

tccttcaaag gcatcaagta cttgcggctc aacacactgg cctccctggg ctacgccgtg
2340

gttgtgattg acggcagggg ctcctgtcag cgagggcttc ggttcgaagg ggccctgaaa
2400

aaccaaatgg gccaggtgga gatcgaggac caggtggagg gcctgcagtt cgtggccgag
2460

aagtatggct tcatcgacct gagccgagtt gccatccatg gctggtccta cgggggcttc
2520

ctctcgctca tggggctaat ccacaagccc caggtgttca aggtggccat cgcgggtgcc
2580

ccggtcaccg tctggatggc ctacgacaca gggtacactg agcgctacat ggacgtccct
2640

gagaacaacc agcacggcta tgaggcgggt tccgtggccc tgcacgtgga gaagctgccc
2700

aatgagccca accgcttgct tatcctccac ggcttcctgg acgaaaacgt gcactttttc
2760

cacacaaact tcctcgtctc ccaactgatc cgagcaggga aaccttacca gctccagatc
2820 

taccccaacg agagacacag tattcgctgc cccgagtcgg gcgagcacta tgaagtcacg 
2880 

ttactgcact ttctacagga atacctctga gcctgcccac cgggagccgc cacatcacag 
2940 

cacaagtggc tgcagcctcc gcggggaacc aggcgggagg gactgagtgg cccgcgggcc 
3000 



<210> 2 

<211> 969 

<212> PRT 

<213> Homo sapiens 

<400> 2 

Arg Arg Val Pro Cys Val Arg Arg Gly Cys Arg Pro Pro Leu Pro Pro
1 5 10 15

Leu Pro Gly Ser Gln Ser Arg Ala Trp Ser Arg Asp Arg Glu Ala Pro
20 25 30

Leu Asp Pro Gly Arg Pro Ala Gln Ser Gly Arg Arg Pro Thr Ser Arg
35 40 45

Ser Val Ser His Ala Cys Ser Trp Asn Gly Gly Ser Leu Asp Pro Leu
50 55 60

Glu Gly Thr Pro Ala Leu Leu Arg Ser Ala Glu Arg Leu Met Arg Lys
65 70 75 80

Val Lys Lys Leu Arg Leu Asp Lys Glu Asn Thr Gly Ser Trp Arg Ser
85 90 95

Phe Ser Leu Asn Ser Glu Gly Ala Glu Arg Met Ala Thr Thr Gly Thr
100 105 110

Pro Thr Ala Asp Arg Gly Asp Ala Ala Ala Thr Asp Asp Pro Ala Ala
115 120 125

Arg Phe Gln Val Gln Lys His Ser Trp Asp Gly Leu Arg Ser Ile Ile
130 135 140

His Gly Ser Arg Lys Tyr Ser Gly Leu Ile Val Asn Lys Ala Pro His
145 150 155 160

Asp Phe Gln Phe Val Gln Lys Thr Asp Glu Ser Gly Pro His Ser His
165 170 175

Arg Leu Tyr Tyr Leu Gly Met Pro Tyr Gly Ser Arg Glu Asn Ser Leu
180 185 190

Leu Tyr Ser Glu Ile Pro Lys Lys Val Arg Lys Glu Ala Leu Leu Leu
195 200 205

Leu Ser Trp Lys Gln Met Leu Asp His Phe Gln Ala Thr Pro His His
210 215 220

Gly Val Tyr Ser Arg Glu Glu Glu Leu Leu Arg Glu Arg Lys Arg Leu
225 230 235 240

Gly Val Phe Gly Ile Thr Ser Tyr Asp Phe His Ser Glu Ser Gly Leu
245 250 255

Phe Leu Phe Gln Ala Ser Asn Ser Leu Phe His Cys Arg Asp Gly Gly
260 265 270

Lys Asn Gly Phe Met Val Ser Pro Met Lys Pro Leu Glu Ile Lys Thr
275 280 285

Gln Cys Ser Gly Pro Arg Met Asp Pro Lys Ile Cys Pro Ala Asp Pro
290 295 300

Ala Phe Phe Ser Phe Asn Asn Asn Ser Asp Leu Trp Val Ala Asn Ile
305 310 315 320

Glu Thr Gly Glu Glu Arg Arg Leu Thr Phe Cys His Gln Gly Leu Ser
325 330 335

Asn Val Leu Asp Asp Pro Lys Ser Ala Gly Val Ala Thr Phe Val Ile
340 345 350

Gln Glu Glu Phe Asp Arg Phe Thr Gly Tyr Trp Trp Cys Pro Thr Ala
355 360 365

Ser Trp Glu Gly Ser Gln Gly Leu Lys Thr Leu Arg Ile Leu Tyr Glu
370 375 380

Glu Val Asp Glu Ser Glu Val Glu Val Ile His Val Pro Ser Pro Ala
385 390 395 400

Leu Glu Glu Arg Lys Thr Asp Ser Tyr Arg Tyr Pro Arg Thr Gly Ser
405 410 415

Lys Asn Pro Lys Ile Ala Leu Lys Leu Ala Glu Phe Gln Thr Asp Ser
420 425 430

Gln Gly Lys Ile Val Ser Thr Gln Glu Lys Glu Leu Val Gln Pro Phe
435 440 445

Ser Ser Leu Phe Pro Lys Val Glu Tyr Ile Ala Arg Ala Gly Trp Thr
450 455 460

Arg Asp Gly Lys Tyr Ala Trp Ala Met Phe Leu Asp Arg Pro Gln Gln
465 470 475 480

Trp Leu Gln Leu Val Leu Leu Pro Pro Ala Leu Phe Ile Pro Ser Thr
485 490 495

Glu Asn Glu Glu Gln Arg Leu Ala Ser Ala Arg Ala Val Pro Arg Asn
500 505 510

Val Gln Pro Tyr Val Val Tyr Glu Glu Val Thr Asn Val Trp Ile Asn
515 520 525

Val His Asp Ile Phe Tyr Pro Phe Pro Gln Ser Glu Gly Glu Asp Glu
530 535 540

Leu Cys Phe Leu Arg Ala Asn Glu Cys Lys Thr Gly Phe Cys His Leu
545 550 555 560

Tyr Lys Val Thr Ala Val Leu Lys Ser Gln Gly Tyr Asp Trp Ser Glu
565 570 575

Pro Phe Ser Pro Gly Glu Asp Glu Phe Lys Cys Pro Ile Lys Glu Glu
580 585 590

Ile Ala Leu Thr Ser Gly Glu Trp Glu Val Leu Ala Arg His Gly Ser
595 600 605

Lys Ile Trp Val Asn Glu Glu Thr Lys Leu Val Tyr Phe Gln Gly Thr
610 615 620

Lys Asp Thr Pro Leu Glu His His Leu Tyr Val Val Ser Tyr Glu Ala
625 630 635 640

Ala Gly Glu Ile Val Arg Leu Thr Thr Pro Gly Phe Ser His Ser Cys
645 650 655

Ser Met Ser Gln Asn Phe Asp Met Phe Val Ser His Tyr Ser Ser Val
660 665 670

Ser Thr Pro Pro Cys Val His Val Tyr Lys Leu Ser Gly Pro Asp Asp
675 680 685

Asp Pro Leu His Lys Gln Pro Arg Phe Trp Ala Ser Met Met Glu Ala
690 695 700

Ala Ser Cys Pro Pro Asp Tyr Val Pro Pro Glu Ile Phe His Phe His
705 710 715 720

Thr Arg Ser Asp Val Arg Leu Tyr Gly Met Ile Tyr Lys Pro His Ala
725 730 735

Leu Gln Pro Gly Lys Lys His Pro Thr Val Leu Phe Val Tyr Gly Gly
740 745 750

Pro Gln Val Gln Leu Val Asn Asn Ser Phe Lys Gly Ile Lys Tyr Leu
755 760 765

Arg Leu Asn Thr Leu Ala Ser Leu Gly Tyr Ala Val Val Val Ile Asp
770 775 780

Gly Arg Gly Ser Cys Gln Arg Gly Leu Arg Phe Glu Gly Ala Leu Lys
785 790 795 800

Asn Gln Met Gly Gln Val Glu Ile Glu Asp Gln Val Glu Gly Leu Gln
805 810 815

Phe Val Ala Glu Lys Tyr Gly Phe Ile Asp Leu Ser Arg Val Ala Ile
820 825 830

His Gly Trp Ser Tyr Gly Gly Phe Leu Ser Leu Met Gly Leu Ile His
835 840 845

Lys Pro Gln Val Phe Lys Val Ala Ile Ala Gly Ala Pro Val Thr Val
850 855 860

Trp Met Ala Tyr Asp Thr Gly Tyr Thr Glu Arg Tyr Met Asp Val Pro
865 870 875 880

Glu Asn Asn Gln His Gly Tyr Glu Ala Gly Ser Val Ala Leu His Val
885 890 895

Glu Lys Leu Pro Asn Glu Pro Asn Arg Leu Leu Ile Leu His Gly Phe
900 905 910

Leu Asp Glu Asn Val His Phe Phe His Thr Asn Phe Leu Val Ser Gln
915 920 925

Leu Ile Arg Ala Gly Lys Pro Tyr Gln Leu Gln Ile Tyr Pro Asn Glu
930 935 940

Arg His Ser Ile Arg Cys Pro Glu Ser Gly Glu His Tyr Glu Val Thr
945 950 955 960

Leu Leu His Phe Leu Gln Glu Tyr Leu
965



<210> 3 

<211> 3287 

<212> DNA 

<213> Mus musculus 

<400> 3 

ccatcacagg agccccagag gatgtgcagc ggggtctccc cagttgagca ggtggccgca 
60 

ggggacatgg atgacacggc agcacgcttc tgtgtgcaga agcactcgtg ggatgggctg 
120 

cgtagcatta tccacggcag tcgcaagtcc tcgggcctca ttgtcagcaa ggccccccac 
180 

gacttccagt ttgtgcagaa gcctgacgag tctggccccc actctcaccg tctctattac 
240 

ctcggaatgc cttacggcag ccgtgagaac tccctcctct actccgagat ccccaagaaa 
300 

gtgcggaagg aggccctgct gctgctgtcc tggaagcaga tgctggacca cttccaggcc 
360 

acaccccacc atggtgtcta ctcccgagag gaggagctac tgcgggagcg caagcgcctg 
420 

ggcgtcttcg gaatcacctc ttatgacttc cacagtgaga gcggcctctt cctcttccag 
480 

gccagcaata gcctgttcca ctgcagggat ggtggcaaga atggctttat ggtgtccccg 
540 

atgaagccac tggagatcaa gactcagtgt tctgggccac gcatggaccc caaaatctgc 
600 

cccgcagacc ctgccttctt ttccttcatc aacaacagtg atctgtgggt ggcaaacatc 
660 

gagactgggg aggaacggcg gctcaccttc tgtcaccagg gttcagctgg tgtcctggac 
720 

aatcccaaat cagcaggcgt ggccaccttt gtcatccagg aggagttcga ccgcttcact 
780 

gggtgctggt ggtgccccac ggcctcttgg gaaggctccg aaggtctcaa gacgctgcgc 
840 
atcctatatg aggaagtgga cgagtctgaa gtggaggtca ttcatgtgcc ctcccccgcc
900 

ctggaggaga ggaagacgga ctcctaccgc taccccagga caggcagcaa gaaccccaag 
960 



attgccctga agctggctga gctccagacg gaccatcagg gcaaaatcgt gtcaagctgc 
1020 



gagaaggaac tggtacagcc attcagctcc cttttcccca aagtggagta catcgcccgg 
1080 



gctggctgga cacgggacgg caaatatgcc tgggccatgt tcctggaccg tccccagcaa 
1140 



cggcttcagc ttgtcctcct gccccctgct ctcttcatcc cggccgttga gagtgaggcc 
1200 



cagcggcagg cagctgccag agccgtcccc aagaatgtgc agccctttgt catctatgaa 
1260 



gaagtcacca atgtctggat caacgtccac gacatcttcc acccgtttcc tcaggctgag 
1320



ggccagcagg acttttgttt ccttcgtgcc aacgaatgca agactggctt ctgccacctg
1380 



tacagggtca cagtggaact taaaaccaag gactatgact ggacggaacc cctcagccct 
1440 

acagaaggtg agtttaagtg ccccatcaag gaggaggtcg ccctgaccag tggcgagtgg 
1500 



gaggtcttgt cgaggcatgg ctccaagatc tgggtcaacg agcagacgaa gctggtgtac 
1560 



tttcaaggta caaaggacac accgctggaa catcacctct atgtggtcag ctacgagtca 
1620 



gcaggcgaga tcgtgcggct caccacgctc ggcttctccc acagctgctc catgagccag
1680 



agcttcgaca tgttcgtgag tcactacagc agtgtgagca cgccaccctg tgtacatgtg 
1740 

tacaagctga gcggccccga tgatgaccca ctgcacaagc aaccacgctt ctgggccagc 
1800

atgatggagg cagccaattg ccccccagac tatgtgcccc ctgagatctt ccacttccac 
1860



acccgtgcag acgtgcagct ctacggcatg atctacaagc cacacaccct gcaacctggg 
1920



aggaagcacc ccactgtgct ctttgtctat gggggcccac aggtgcagtt ggtgaacaac 
1980 



tcctttaagg gcatcaaata cctgcggcta aatacactgg catccttggg ctatgctgtg 
2040 



gtggtgatcg atggtcgggg ctcctgtcag cggggcctgc acttcgaggg ggccctgaaa 
2100 



aatcaaatgg gccaggtgga gattgaggac caggtggaag gcttgcagta cgtggctgag 
2160 



aagtatggct tcattgactt gagccgagtc gccatccatg gctggtccta cggcggcttc 
2220 



ctctcactca tggggctcat ccacaagcca caagtgttca aggtagccat tgcgggcgct 
2280 



cctgtcactg tgtggatggc ctatgacaca gggtacacgg aacgatacat ggatgtcccc 
2340 



gaaaataacc agcaaggcta tgaggcaggg tctgtagccc tgcatgtgga gaagctgccc 
2400 



aatgagccta accgcctgct tatcctccac ggcttcctgg acgagaacgt tcacttcttc 
2460 



cacacaaatt tcctggtgtc ccagctgatc cgagcaggaa agccatacca gcttcagatc
2520

tacccaaacg agagacatag catccgctgc cgcgagtccg gagagcatta cgaggtgacg
2580

ctgctgcact ttctgcagga acacctgtga cctcagtccc gactcctgac gccaccgctg
2640

ctcttcttgc gtttttgtaa tcttttcatt tttgaagctt ccaatttgct tgctgctgct
2700

gctgcctggg ggccaggaca gaggtagtgg cggcccccat gccgccctcc ttgagctggt
2760

gaggagaagt cgccattgag cacacaacct ccaccagact gccatggccc cgaacctgca
2820

attccatcct agcgcagaag catgtgcctg ccacctgctg cccctgcaga gtcatgtgtg
2880

tttgtggtgg gcattttaaa taattattta aaagacagga agtaagcggt accgagcaat
2940

gaaactgaag gtacagcact gggcgtctgg ggaccccacg ctctcccaac gcccagacta
3000 

tgtggagctg ccaagcccct gtctgggcac ctctgccctg cctgtctgct gcccggatcc 
3060 

tcctcactta gcacctaggg gtgtcagggt cgggagtagg acctgtcctg acctcagggt 
3120 

tatatatagc ccttccccac tccctcctac gagagttctg gcataaagaa gtaaaaaaaa 
3180 

aaaaaaaaaa aacaaacaaa aaaaccaaac cacctctaca tattatggaa agaaaatatt 
3240 

tttgtcaatt cttattcttt tataattatg tggtatgtag actcatt 
3287 



<210> 4 

<211> 869 

<212> PRT

<213> Mus musculus 

<400> 4 

Pro Ser Gln Glu Pro Gln Arg Met Cys Ser Gly Val Ser Pro Val Glu
1 5 10 15

Gln Val Ala Ala Gly Asp Met Asp Asp Thr Ala Ala Arg Phe Cys Val
20 25 30

Gln Lys His Ser Trp Asp Gly Leu Arg Ser Ile Ile His Gly Ser Arg
35 40 45

Lys Ser Ser Gly Leu Ile Val Ser Lys Ala Pro His Asp Phe Gln Phe
50 55 60

Val Gln Lys Pro Asp Glu Ser Gly Pro His Ser His Arg Leu Tyr Tyr
65 70 75 80

Leu Gly Met Pro Tyr Gly Ser Arg Glu Asn Ser Leu Leu Tyr Ser Glu
85 90 95

Ile Pro Lys Lys Val Arg Lys Glu Ala Leu Leu Leu Leu Ser Trp Lys
100 105 110

Gln Met Leu Asp His Phe Gln Ala Thr Pro His His Gly Val Tyr Ser
115 120 125

Arg Glu Glu Glu Leu Leu Arg Glu Arg Lys Arg Leu Gly Val Phe Gly
130 135 140

Ile Thr Ser Tyr Asp Phe His Ser Glu Ser Gly Leu Phe Leu Phe Gln
145 150 155 160

Ala Ser Asn Ser Leu Phe His Cys Arg Asp Gly Gly Lys Asn Gly Phe
165 170 175

Met Val Ser Pro Met Lys Pro Leu Glu Ile Lys Thr Gln Cys Ser Gly
180 185 190

Pro Arg Met Asp Pro Lys Ile Cys Pro Ala Asp Pro Ala Phe Phe Ser
195 200 205

Phe Ile Asn Asn Ser Asp Leu Trp Val Ala Asn Ile Glu Thr Gly Glu
210 215 220

Glu Arg Arg Leu Thr Phe Cys His Gln Gly Ser Ala Gly Val Leu Asp
225 230 235 240

Asn Pro Lys Ser Ala Gly Val Ala Thr Phe Val Ile Gln Glu Glu Phe
245 250 255

Asp Arg Phe Thr Gly Cys Trp Trp Cys Pro Thr Ala Ser Trp Glu Gly
260 265 270

Ser Glu Gly Leu Lys Thr Leu Arg Ile Leu Tyr Glu Glu Val Asp Glu
275 280 285

Ser Glu Val Glu Val Ile His Val Pro Ser Pro Ala Leu Glu Glu Arg
290 295 300

Lys Thr Asp Ser Tyr Arg Tyr Pro Arg Thr Gly Ser Lys Asn Pro Lys
305 310 315 320

Ile Ala Leu Lys Leu Ala Glu Leu Gln Thr Asp His Gln Gly Lys Ile
325 330 335

Val Ser Ser Cys Glu Lys Glu Leu Val Gln Pro Phe Ser Ser Leu Phe
340 345 350

Pro Lys Val Glu Tyr Ile Ala Arg Ala Gly Trp Thr Arg Asp Gly Lys
355 360 365

Tyr Ala Trp Ala Met Phe Leu Asp Arg Pro Gln Gln Arg Leu Gln Leu
370 375 380

Val Leu Leu Pro Pro Ala Leu Phe Ile Pro Ala Val Glu Ser Glu Ala
385 390 395 400

Gln Arg Gln Ala Ala Ala Arg Ala Val Pro Lys Asn Val Gln Pro Phe
405 410 415

Val Ile Tyr Glu Glu Val Thr Asn Val Trp Ile Asn Val His Asp Ile
420 425 430

Phe His Pro Phe Pro Gln Ala Glu Gly Gln Gln Asp Phe Cys Phe Leu
435 440 445

Arg Ala Asn Glu Cys Lys Thr Gly Phe Cys His Leu Tyr Arg Val Thr
450 455 460

Val Glu Leu Lys Thr Lys Asp Tyr Asp Trp Thr Glu Pro Leu Ser Pro
465 470 475 480

Thr Glu Gly Glu Phe Lys Cys Pro Ile Lys Glu Glu Val Ala Leu Thr
485 490 495

Ser Gly Glu Trp Glu Val Leu Ser Arg His Gly Ser Lys Ile Trp Val
500 505 510

Asn Glu Gln Thr Lys Leu Val Tyr Phe Gln Gly Thr Lys Asp Thr Pro
515 520 525

Leu Glu His His Leu Tyr Val Val Ser Tyr Glu Ser Ala Gly Glu Ile
530 535 540

Val Arg Leu Thr Thr Leu Gly Phe Ser His Ser Cys Ser Met Ser Gln
545 550 555 560

Ser Phe Asp Met Phe Val Ser His Tyr Ser Ser Val Ser Thr Pro Pro
565 570 575

Cys Val His Val Tyr Lys Leu Ser Gly Pro Asp Asp Asp Pro Leu His
580 585 590

Lys Gln Pro Arg Phe Trp Ala Ser Met Met Glu Ala Ala Asn Cys Pro
595 600 605

Pro Asp Tyr Val Pro Pro Glu Ile Phe His Phe His Thr Arg Ala Asp
610 615 620

Val Gln Leu Tyr Gly Met Ile Tyr Lys Pro His Thr Leu Gln Pro Gly
625 630 635 640

Arg Lys His Pro Thr Val Leu Phe Val Tyr Gly Gly Pro Gln Val Gln
645 650 655

Leu Val Asn Asn Ser Phe Lys Gly Ile Lys Tyr Leu Arg Leu Asn Thr
660 665 670

Leu Ala Ser Leu Gly Tyr Ala Val Val Val Ile Asp Gly Arg Gly Ser
675 680 685

Cys Gln Arg Gly Leu His Phe Glu Gly Ala Leu Lys Asn Gln Met Gly
690 695 700

Gln Val Glu Ile Glu Asp Gln Val Glu Gly Leu Gln Tyr Val Ala Glu
705 710 715 720

Lys Tyr Gly Phe Ile Asp Leu Ser Arg Val Ala Ile His Gly Trp Ser
725 730 735

Tyr Gly Gly Phe Leu Ser Leu Met Gly Leu Ile His Lys Pro Gln Val
740 745 750

Phe Lys Val Ala Ile Ala Gly Ala Pro Val Thr Val Trp Met Ala Tyr
755 760 765

Asp Thr Gly Tyr Thr Glu Arg Tyr Met Asp Val Pro Glu Asn Asn Gln
770 775 780

Gln Gly Tyr Glu Ala Gly Ser Val Ala Leu His Val Glu Lys Leu Pro
785 790 795 800

Asn Glu Pro Asn Arg Leu Leu Ile Leu His Gly Phe Leu Asp Glu Asn
805 810 815

Val His Phe Phe His Thr Asn Phe Leu Val Ser Gln Leu Ile Arg Ala
820 825 830

Gly Lys Pro Tyr Gln Leu Gln Ile Tyr Pro Asn Glu Arg His Ser Ile
835 840 845

Arg Cys Arg Glu Ser Gly Glu His Tyr Glu Val Thr Leu Leu His Phe
850 855 860

Leu Gln Glu His Leu
865




<210> 5

<211> 3120

<212> DNA

<213> Homo sapiens

<400> 5



aagtgctaaa gcctccgagg ccaaggccgc tgctactgcc gccgctgctt cttagtgccg 
60 

cgttcgccgc ctgggttgtc accggcgccg ccgccgagga agccactgca accaggaccg 
120 



gagtggaggc ggcgcagcat gaagcggcgc aggcccgctc catagcgcac gtcgggacgg
180

tccgggcggg gccgggggga aggaaaatgc aacatggcag cagcaatgga aacagaacag
240

ctgggtgttg agatatttga aactgcggac tgtgaggaga atattgaatc acaggatcgg
300

cctaaattgg agccttttta tgttgagcgg tattcctgga gtcagcttaa aaagctgctt
360

gccgatacca gaaaatatca tggctacatg atggctaagg caccacatga tttcatgttt
420

gtgaagagga atgatccaga tggacctcat tcagacagaa tctattacct tgccatgtct
480

ggtgagaaca gagaaaatac actgttttat tctgaaattc ccaaaactat caatagagca
540

gcagtcttaa tgctctcttg gaagcctctt ttggatcttt ttcaggcaac actggactat
600

ggaatgtatt ctcgagaaga agaactatta agagaaagaa aacgcattgg aacagtcgga
660

attgcttctt acgattatca ccaaggaagt ggaacatttc tgtttcaagc cggtagtgga
720

atttatcacg taaaagatgg agggccacaa ggatttacgc aacaaccttt aaggcccaat
780

ctagtggaaa ctagttgtcc caacatacgg atggatccaa aattatgccc cgctgatcca
840

gactggattg cttttataca tagcaacgat atttggatat ctaacatcgt aaccagagaa
900

gaaaggagac tcacttatgt gcacaatgag ctagccaaca tggaagaaga tgccagatca
960

gctggagtcg ctacctttgt tctccaagaa gaatttgata gatattctgg ctattggtgg
1020

tgtccaaaag ctgaaacaac tcccagtggt ggtaaaattc ttagaattct atatgaagaa
1080

aatgatgaat ctgaggtgga aattattcat gttacatccc ctatgttgga aacaaggagg
1140

gcagattcat tccgttatcc taaaacaggt acagcaaatc ctaaagtcac ttttaagatg
1200

PCT/AU01/01388

tcagaaataa tgattgatgc tgaaggaagg atcatagatg tcatagataa ggaactaatt
1260

caaccttttg agattctatt tgaaggagtt gaatatattg ccagagctgg atggactcct
1320

gagggaaaat atgcttggtc catcctacta gatcgctccc agactcgcct acagatagtg
1380

ttgatctcac ctgaattatt tatcccagta gaagatgatg ttatggaaag gcagagactc
1440

attgagtcag tgcctgattc tgtgacgcca ctaattatct atgaagaaac aacagacatc
1500

tggataaata tccatgacat ctttcatgtt tttccccaaa gtcacgaaga ggaaattgag
1560

tttatttttg cctctgaatg caaaacaggt ttccgtcatt tatacaaaat tacatctatt
1620

ttaaaggaaa gcaaatataa acgatccagt ggtgggctgc ctgctccaag tgatttcaag
1680

tgtcctatca aagaggagat agcaattacc agtggtgaat gggaagttct tggccggcat
1740

ggatctaata tccaagttga tgaagtcaga aggctggtat attttgaagg caccaaagac
1800

tcccctttag agcatcacct gtacgtagtc agttacgtaa atcctggaga ggtgacaagg
1860

ctgactgacc gtggctactc acattcttgc tgcatcagtc agcactgtga cttctttata
1920

agtaagtata gtaaccagaa gaatccacac tgtgtgtccc tttacaagct atcaagtcct
1980

gaagatgacc caacttgcaa aacaaaggaa ttttgggcca ccattttgga ttcagcaggt
2040

cctcttcctg actatactcc tccagaaatt ttctcttttg aaagtactac tggatttaca
2100

ttgtatggga tgctctacaa gcctcatgat ctacagcctg gaaagaaata tcctactgtg
2160

ctgttcatat atggtggtcc tcaggtgcag ttggtgaata atcggtttaa aggagtcaag
2220

tatttccgct tgaataccct agcctctcta ggttatgtgg ttgtagtgat agacaacagg
2280 

ggatcctgtc accgagggct taaatttgaa ggcgccttta aatataaaat gggtcaaata 
2340 

gaaattgacg atcaggtgga aggactccaa tatctagctt ctcgatatga tttcattgac 
2400 

ttagatcgtg tgggcatcca cggctggtcc tatggaggat acctctccct gatggcatta 
2460

atgcagaggt cagatatctt cagggttgct attgctgggg ccccagtcac tctgtggatc 
2520 

ttctatgata caggatacac ggaacgttat atgggtcacc ctgaccagaa tgaacagggc 
2580 

tattacttag gatctgtggc catgcaagca gaaaagttcc cctctgaacc aaatcgttta 
2640 

ctgctcttac atggtttcct ggatgagaat gtccattttg cacataccag tatattactg 
2700 

agttttttag tgagggctgg aaagccatat gatttacaga tctatcctca ggagagacac 
2760 

agcataagag ttcctgaatc gggagaacat tatgaactgc atcttttgca ctaccttcaa 
2820 

gaaaaccttg gatcacgtat tgctgctcta aaagtgatat aattttgacc tgtgtagaac 
2880 

tctctggtat acactggcta tttaaccaaa tgaggaggtt taatcaacag aaaacacaga 
2940 

attgatcatc acattttgat acctgccatg taacatctac tcctgaaaat aaatgtggtg 
3000 

ccatgcaggg gtctacggtt tgtggtagta atctaatacc ttaaccccac atgctcaaaa 
3060

tcaaatgata catattcctg agagacccag caataccata agaattacta aaaaaaaaaa 
3120 



<210> 6 

<211> 882 

<212> PRT 

<213> Homo sapiens 

<400> 6 
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Met Ala Ala Ala Met Glu Thr Glu Gin Leu Gly Val Glu He Phe Glu 

15 10 15 



Thr Ala Asp Cys Glu Glu Asn He Glu Ser Gin Asp Arg Pro Lys Leu 
20 25 30 



Glu Pro Phe Tyr Val Glu Arg Tyr Ser Trp Ser Gin Leu Lys Lys Leu 
35 40 45 



Leu Ala Asp Thr Arg Lys Tyr His Gly Tyr Met Met Ala Lys Ala Pro 
50 55 60 



His Asp Phe Met Phe Val Lys Arg Asn Asp Pro Asp Gly Pro His Ser 
65 , 70 ' 75 * " 80 



Asp Arg He Tyr Tyr Leu Ala Met Ser Gly Glu Asn Arg Glu Asn Thr 
85 90 95 



Leu Phe Tyr Ser Glu He Pro Lys Thr He Asn Arg Ala Ala Val Leu 
100 - 105 110 



Met Leu Ser Trp Lys Pro Leu Leu Asp Leu Phe Gin Ala Thr Leu Asp 
115 120 125 



Tyr Gly Met Tyr Ser Arg Glu Glu Glu Leu Leu Arg Glu Arg Lys Arg 
130 135 140 



He Gly Thr Val Gly He Ala Ser Tyr Asp Tyr His Gin Gly Ser Gly 
145 150 155 160 



Thr Phe Leu Phe Gin Ala Gly Ser Gly He Tyr His Val Lys Asp Gly 
165 170 175 



Gly Pro Gin Gly Phe Thr Gin Gin Pro Leu Arg Pro Asn Leu Val Glu 
180 185 ^ 190 



Thr Ser Cys Pro Asn He Arg Met Asp Pro Lys Leu Cys Pro Ala Asp 
195 200 205 
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Pro Asp Trp He Ala Phe He His Ser. Asn Asp He Trp He Ser Asn 
210 215 220 



He Val Thr Arg Glu Glu Arg Arg Leu Thr Tyr Val His Asn Glu Leu 
225 230 235 240 



Ala Asn Met Glu Glu Asp Ala Arg Ser Ala Gly Val Ala Thr Phe Val 
245 250 " 255 



Leu Gin Glu Glu Phe Asp Arg Tyr Ser Gly Tyr Trp Trp Cys Pro Lys 
260 265 " " 270 



Ala Glu Thr Thr Pro Ser Gly Gly Lys He Leu Arg lie Leu Tyr Glu 
275 280 285 



Glu Asn Asp Glu . Ser Glu Val Glu He He His Val Thr Ser Pro Met 
290 295 300 



Leu Glu Thr Arg Arg Ala Asp Ser Phe Arg Tyr Pro Lys Thr Gly Thr 
305 310 315 "* 320 



Ala Asn Pro Lys Val Thr Phe Lys Met Ser Glu He Met He Asp Ala 
325 330 335 



Glu Gly Arg He He Asp Val He Asp Lys Glu Leu He Gin Pro Phe 
340 345 350 



Glu He Leu Phe Glu Gly Val Glu Tyr Tie Ala Arg Ala Gly Trp Thr 
355 360 365 



Pro Glu Gly Lys Tyr Ala Trp Ser He Leu Leu Asp Arg Ser Gin Thr 
370 375 380 



Arg Leu Gin . He Val Leu He Ser Pro Glu Leu Phe He Pro Val Glu 
385 390 395 400 



Asp Asp Val Met Glu Arg Gin Arg Leu He Glu Ser Val Pro Asp Ser 
405 410 415 
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Val Thr. Pro Leu lie lie Tyr Glu Glu Thr Thr Asp lie Trp lie Asn 
420 425 * 430 



He His Asp He Phe .His Val Phe Pro • Gin Ser His Glu Glu Glu He 
435 440 445 



Glu Phe He Phe Ala Ser Glu Cys Lys Thr Gly Phe Arg His Leu Tyr 
450 ■ 455 460 



Lys He Thr Ser Ile^ Leu Lys Glu Ser Lys Tyr Lys Arg Ser Ser Gly 
465 470 475 480 



Gly Leu Pro Ala Pro Ser Asp Phe Lys . Cys Pro He Lys Glu Glu lie-. 

485 " 490 495 



Ala He Thr Ser Gly Glu Trp Glu Val Leu Gly Arg His Gly Ser. Asn 
500 505 ~ 510 



He Gin Val Asp Glu Val Arg Arg Leu Val Tyr Phe Glu Gly Thr Lys 
515 520 525 



Asp Ser Pro Leu Glu His His Leu Tyr Val Val Ser Tyr Val Asn Pro 
530 535 540 



Gly Glu Val Thr Arg Leu Thr Asp Arg Gly Tyr Ser His Ser Cys Cys 
545 550 555 560 



He Ser Gin His Cys Asp Phe Phe He Ser Lys Tyr Ser Asn Gin Lys 
565 570 575 



Asn Pro His Cys Val Ser Leu Tyr Lys Leu Ser Ser Pro Glu Asp Asp 
580 585 590 



Pro Thr Cys Lys Thr Lys Glu Phe Trp Ala Thr lie Leu Asp Ser .Ala 
595 600 605 



Gly Pro Leu Pro Asp Tyr Thr Pro Pro Glu He Phe Ser Phe Glu Ser 
610 615 620 
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Thr Thr Gly Phe Thr Leu Tyr Gly Met Leu Tyr Lys Pro His Asp Leu 
625 630 635 640 



Gin Pro Gly Lys Lys Tyr Pro Thr Val Leu Phe lie Tyr Gly Gly Pro 
645 650 ~ 655 



Gin Val Gin Leu Val Asn Asn Arg Phe Lys Gly Val Lys Tyr Phe Arg 
660 665 " " " 670 



Leu Asn Thr Leu Ala Ser Leu Gly Tyr Val Val Val Val He Asp Asn 
675 680 685 



Arg Gly Ser Cys His Arg Gly Leu Lys Phe Glu Gly Ala Phe Lys Tyr 
690 695 700 



Lys Met Gly Gin He Glu He Asp Asp Gin Val Glu Gly Leu Gin Tyr 
705 710 715 720 



Leu Ala Ser Arg Tyr Asp Phe He Asp Leu Asp Arg Val Gly He His 
725 730 735 



Gly Trp Ser Tyr Gly Gly Tyr Leu Ser Leu Met Ala Leu Met Gin Arg 
740 745 750 



Ser Asp Ile Phe Arg Val Ala Ile Ala Gly Ala Pro Val Thr Leu Trp
755 760 765 



Ile Phe Tyr Asp Thr Gly Tyr Thr Glu Arg Tyr Met Gly His Pro Asp
770 775 780 



Gln Asn Glu Gln Gly Tyr Tyr Leu Gly Ser Val Ala Met Gln Ala Glu
785 790 795 800 



Lys Phe Pro Ser Glu Pro Asn Arg Leu Leu Leu Leu His Gly Phe Leu
805 810 815 



Asp Glu Asn Val His Phe Ala His Thr Ser Ile Leu Leu Ser Phe Leu
820 825 830



Val Arg Ala Gly Lys Pro Tyr Asp Leu Gln Ile Tyr Pro Gln Glu Arg
835 840 845



His Ser Ile Arg Val Pro Glu Ser Gly Glu His Tyr Glu Leu His Leu
850 855 860 



Leu His Tyr Leu Gln Glu Asn Leu Gly Ser Arg Ile Ala Ala Leu Lys
865 870 875 880



Val Ile



<210> 7 

<211> 830 

<212> PRT 

<213> Homo sapiens 

<400> 7 

Leu Arg Ser Ile Ile His Gly Ser Arg Lys Tyr Ser Gly Leu Ile Val
1 5 10 15 



Asn Lys Ala Pro His Asp Phe Gln Phe Val Gln Lys Thr Asp Glu Ser
20 25 30



Gly Pro His Ser His Arg Leu Tyr Tyr Leu Gly Met Pro Tyr Gly Ser 
35 40 45 



Arg Glu Asn Ser Leu Leu Tyr Ser Glu Ile Pro Lys Lys Val Arg Lys
50 55 60 



Glu Ala Leu Leu Leu Leu Ser Trp Lys Gln Met Leu Asp His Phe Gln
65 70 75 80 



Ala Thr Pro His His Gly Val Tyr Ser Arg Glu Glu Glu Leu Leu Arg 
85 90 95 



Glu Arg Lys Arg Leu Gly Val Phe Gly Ile Thr Ser Tyr Asp Phe His
100 105 110



Ser Glu Ser Gly Leu Phe Leu Phe Gln Ala Ser Asn Ser Leu Phe His
115 120 125 



Cys Arg Asp Gly Gly Lys Asn Gly Phe Met Val Ser Pro Met Lys Pro 
130 135 140 



Leu Glu Ile Lys Thr Gln Cys Ser Gly Pro Arg Met Asp Pro Lys Ile
145 150 155 160 



Cys Pro Ala Asp Pro Ala Phe Phe Ser Phe Asn Asn Asn Ser Asp Leu 
165 170 175 



Trp Val Ala Asn Ile Glu Thr Gly Glu Glu Arg Arg Leu Thr Phe Cys
180 185 190 



His Gln Gly Leu Ser Asn Val Leu Asp Asp Pro Lys Ser Ala Gly Val
195 200 205



Ala Thr Phe Val Ile Gln Glu Glu Phe Asp Arg Phe Thr Gly Tyr Trp
210 215 220 



Trp Cys Pro Thr Ala Ser Trp Glu Gly Ser Gln Gly Leu Lys Thr Leu
225 230 235 240



Arg Ile Leu Tyr Glu Glu Val Asp Glu Ser Glu Val Glu Val Ile His
245 250 255 



Val Pro Ser Pro Ala Leu Glu Glu Arg Lys Thr Asp Ser Tyr Arg Tyr 
260 265 270



Pro Arg Thr Gly Ser Lys Asn Pro Lys Ile Ala Leu Lys Leu Ala Glu
275 280 285 



Phe Gln Thr Asp Ser Gln Gly Lys Ile Val Ser Thr Gln Glu Lys Glu
290 295 300 



Leu Val Gln Pro Phe Ser Ser Leu Phe Pro Lys Val Glu Tyr Ile Ala
305 310 315 320



Arg Ala Gly Trp Thr Arg Asp Gly Lys Tyr Ala Trp Ala Met Phe Leu
325 330 335 



Asp Arg Pro Gln Gln Trp Leu Gln Leu Val Leu Leu Pro Pro Ala Leu
340 345 350 



Phe Ile Pro Ser Thr Glu Asn Glu Glu Gln Arg Leu Ala Ser Ala Arg
355 360 365 



Ala Val Pro Arg Asn Val Gln Pro Tyr Val Val Tyr Glu Glu Val Thr
370 375 380 



Asn Val Trp Ile Asn Val His Asp Ile Phe Tyr Pro Phe Pro Gln Ser
385 390 395 400 



Glu Gly Glu Asp Glu Leu Cys Phe Leu Arg Ala Asn Glu Cys Lys Thr 
405 410 415 



Gly Phe Cys His Leu Tyr Lys Val Thr Ala Val Leu Lys Ser Gln Gly
420 425 430 



Tyr Asp Trp Ser Glu Pro Phe Ser Pro Gly Glu Asp Glu Phe Lys Cys 
435 440 445 



Pro Ile Lys Glu Glu Ile Ala Leu Thr Ser Gly Glu Trp Glu Val Leu
450 455 460 



Ala Arg His Gly Ser Lys Ile Trp Val Asn Glu Glu Thr Lys Leu Val
465 470 475 480



Tyr Phe Gln Gly Thr Lys Asp Thr Pro Leu Glu His His Leu Tyr Val
485 490 495 



Val Ser Tyr Glu Ala Ala Gly Glu Ile Val Arg Leu Thr Thr Pro Gly
500 505 510 



Phe Ser His Ser Cys Ser Met Ser Gln Asn Phe Asp Met Phe Val Ser
515 520 525



His Tyr Ser Ser Val Ser Thr Pro Pro Cys Val His Val Tyr Lys Leu
530 535 540 



Ser Gly Pro Asp Asp Asp Pro Leu His Lys Gln Pro Arg Phe Trp Ala
545 550 555 560



Ser Met Met Glu Ala Ala Ser Cys Pro Pro Asp Tyr Val Pro Pro Glu 
565 570 575



Ile Phe His Phe His Thr Arg Ser Asp Val Arg Leu Tyr Gly Met Ile
580 585 590



Tyr Lys Pro His Ala Leu Gln Pro Gly Lys Lys His Pro Thr Val Leu
595 600 605 



Phe Val Tyr Gly Gly Pro Gln Val Gln Leu Val Asn Asn Ser Phe Lys
610 615 620 



Gly Ile Lys Tyr Leu Arg Leu Asn Thr Leu Ala Ser Leu Gly Tyr Ala
625 630 635 640 



Val Val Val Ile Asp Gly Arg Gly Ser Cys Gln Arg Gly Leu Arg Phe
645 650 655



Glu Gly Ala Leu Lys Asn Gln Met Gly Gln Val Glu Ile Glu Asp Gln
660 665 670 



Val Glu Gly Leu Gln Phe Val Ala Glu Lys Tyr Gly Phe Ile Asp Leu
675 680 685



Ser Arg Val Ala Ile His Gly Trp Ser Tyr Gly Gly Phe Leu Ser Leu
690 695 700 



Met Gly Leu Ile His Lys Pro Gln Val Phe Lys Val Ala Ile Ala Gly
705 710 715 720 



Ala Pro Val Thr Val Trp Met Ala Tyr Asp Thr Gly Tyr Thr Glu Arg 
725 730 735



Tyr Met Asp Val Pro Glu Asn Asn Gln His Gly Tyr Glu Ala Gly Ser
740 745 750 



Val Ala Leu His Val Glu Lys Leu Pro Asn Glu Pro Asn Arg Leu Leu 
755 760 765 



Ile Leu His Gly Phe Leu Asp Glu Asn Val His Phe Phe His Thr Asn
770 775 780 



Phe Leu Val Ser Gln Leu Ile Arg Ala Gly Lys Pro Tyr Gln Leu Gln
785 790 795 800 



Ile Tyr Pro Asn Glu Arg His Ser Ile Arg Cys Pro Glu Ser Gly Glu
805 810 815 



His Tyr Glu Val Thr Leu Leu His Phe Leu Gln Glu Tyr Leu
820 825 830



<210> 8

<211> 2495

<212> DNA

<213> Homo sapiens

<400> 8

ctccggagca tcatccacgg cagccgcaag tactcgggcc tcattgtcaa caaggcgccc 
60 

cacgacttcc agtttgtgca gaagacggat gagtctgggc cccactccca ccgcctctac 
120 

tacctgggaa tgccatatgg cagccgggag aactccctcc tctactctga gattcccaag 
180 

aaggtccgga aagaggctct gctgctcctg tcctggaagc agatgctgga tcatttccag 
240 

gccacgcccc accatggggt ctactctcgg gaggaggagc tgctgaggga gcggaaacgc
300

ctgggggtct tcggcatcac ctcctacgac ttccacagcg agagtggcct cttcctcttc 
360 

caggccagca acagcctctt ccactgccgc gacggcggca agaacggctt catggtgtcc 
420 

cctatgaaac cgctggaaat caagacccag tgctcagggc cccggatgga ccccaaaatc
480



tgccctgccg accctgcctt cttctccttc aacaataaca gcgacctgtg ggtggccaac
540

atcgagacag gcgaggagcg gcggctgacc ttctgccacc aaggtttatc caatgtcctg
600

gatgacccca agtctgcggg tgtggccacc ttcgtcatac aggaagagtt cgaccgcttc
660

actgggtact ggtggtgccc cacagcctcc tgggaaggtt cagagggcct caagacgctg
720

cgaatcctgt atgaggaagt cgatgagtcc gaggtggagg tcattcacgt cccctctcct
780

gcgctagaag aaaggaagac ggactcgtat cggtacccca ggacaggcag caagaatccc
840

aagattgcct tgaaactggc tgagttccag actgacagcc agggcaagat cgtctcgacc
900

caggagaagg agctggtgca gcccttcagc tcgctgttcc cgaaggtgga gtacatcgcc
960

agggccgggt ggacccggga tggcaaatac gcctgggcca tgttcctgga ccggccccag
1020

cagtggctcc agctcgtcct cctccccccg gccctgttca tcccgagcac agagaatgag
1080

gagcagcggc tagcctctgc cagagctgtc cccaggaatg tccagccgta tgtggtgtac
1140

gaggaggtca ccaacgtctg gatcaatgtt catgacatct tctatccctt cccccaatca
1200

gagggagagg acgagctctg ctttctccgc gccaatgaat gcaagaccgg cttctgccat
1260

ttgtacaaag tcaccgccgt tttaaaatcc cagggctacg attggagtga gcccttcagc
1320

cccggggaag atgaatttaa gtgccccatt aaggaagaga ttgctctgac cagcggtgaa
1380

tgggaggttt tggcgaggca cggctccaag atctgggtca atgaggagac caagctggtg
1440

tacttccagg gcaccaagga cacgccgctg gagcaccacc tctacgtggt cagctatgag
1500

gcggccggcg agatcgtacg cctcaccacg cccggcttct cccatagctg ctccatgagc
1560 

cagaacttcg acatgttcgt cagccactac agcagcgtga gcacgccgcc ctgcgtgcac 
1620 

gtctacaagc tgagcggccc cgacgacgac cccctgcaca agcagccccg cttctgggct 
1680 

agcatgatgg aggcagccag ctgccccccg gattatgttc ctccagagat cttccatttc 
1740 

cacacgcgct cggatgtgcg gctctacggc atgatctaca agccccacgc cttgcagcca 
1800 

gggaagaagc accccaccgt cctctttgta tatggaggcc cccaggtgca gctggtgaat 
1860

aactccttca aaggcatcaa gtacttgcgg ctcaacacac tggcctccct gggctacgcc 
1920 

gtggttgtga ttgacggcag gggctcctgt cagcgagggc ttcggttcga aggggccctg 
1980 

aaaaaccaaa tgggccaggt ggagatcgag gaccaggtgg agggcctgca gttcgtggcc 
2040 

gagaagtatg gcttcatcga cctgagccga gttgccatcc atggctggtc ctacgggggc 
2100 

ttcctctcgc tcatggggct aatccacaag ccccaggtgt tcaaggtggc catcgcgggt 
2160 

gccccggtca ccgtctggat ggcctacgac acagggtaca ctgagcgcta catggacgtc 
2220 

cctgagaaca accagcacgg ctatgaggcg ggttccgtgg ccctgcacgt ggagaagctg 
2280 

cccaatgagc ccaaccgctt gcttatcctc cacggcttcc tggacgaaaa cgtgcacttt 
2340 

ttccacacaa acttcctcgt ctcccaactg atccgagcag ggaaacctta ccagctccag 
2400 

atctacccca acgagagaca cagtattcgc tgccccgagt cgggcgagca ctatgaagtc
2460

acgttactgc actttctaca ggaatacctc tgagc 
2495